Wednesday, April 11, 2012

FSI: Top Two Tips

Field Help is located in the bottom-right corner

As I have helped new indexers this past week, I’ve had to teach them two things right from the start.

Field Help

The first thing you need to remember is that there are answers to your questions in the Field Help window pane. What do you do when there is no household number? What do you do when there is no surname? What do you type in the Titles field?

These questions and more are answered in Field Help. Located in the bottom-right-hand part of the screen, the help content changes as you move from field to field. When you have a question, check there first.

Aligning Highlights

One aspect of FamilySearch Indexing (FSI) that makes it easy to use is field highlighting. As you index the page, the information to be indexed is highlighted in blue on the form.

Blue highlights the information to index

For 1940 Census batches, FSI often highlights the wrong spot on the form.

To fix the highlighting, click on the View menu, then on Adjust Highlights. This puts FSI into Adjust Highlighting Mode.

Select the View menu, then click on Adjust Highlights

Initially, it looks like nothing has happened. Move your mouse over the names on the census and the highlighting frame shows up. Drag the upper-left corner to the spot shown below:

Align the upper-left corner of the frame

Scroll down to line 40. Move the mouse until the frame shows up. Drag the lower-left corner to the spot shown:

Align the lower-left corner of the frame

Scroll to the top of column 19. Drag the upper-right corner to the spot shown:

Align the upper-right corner of the frame

Scroll to the bottom of column 19 and drag the lower-right corner to the spot shown:

Align the lower-right corner of the frame

Finally, turn off Adjust Highlights Mode by selecting the View menu and again clicking Adjust Highlights.

Now as you go about indexing, you will always know where to find the information to index.

Social Security Death Index Redactions

In recent months Ancestry.com and FamilySearch have taken steps to make up for the failure of the Internal Revenue Service (IRS) to utilize the Social Security Death Index (SSDI) to prevent tax fraud. “Social Security Death Index” is a name created by the genealogical community. The government calls it the Public Death Master File (DMF).

Redactions

Ancestry.com removed the SSDI from RootsWeb.com and redacted (removed) social security numbers (SSNs) for individuals who have died in the last ten years. FamilySearch redacted SSNs for those who have died in the last two years.

It is unfortunate that Ancestry.com chose to remove the SSDI from RootsWeb. Not only was it free, but it was the best SSDI search experience on the web. While removing free access will be viewed as nefarious by some, the thinking was that thieves are less likely to pay for a subscription.

This is an example of an individual who died within the last ten years. The SSN is removed:

SSDI entry on Ancestry.com for Gordon B. Hinckley, former president of the Church of Jesus Christ of Latter-day Saints

This example shows where and how the SSN is normally displayed:

SSDI entry on Ancestry.com for Howard W Hunter, former president of the Church of Jesus Christ of Latter-day Saints

FamilySearch redacted SSNs for deaths in the past two years. While Ancestry hides the entire line for blank fields, FamilySearch.org shows the field but leaves the value blank:

SSDI entry on FamilySearch.org for Betty Ford, wife of the former president

Valuable Genealogical Records

The SSDI is important because the United States lacks nationwide registration of vital events: birth, marriage, and death. These are recorded at the local level by state, county, and city governments. Just as its name suggests, the SSDI serves as an index. Its information is second-hand, but it leads to the local jurisdictions where birth and death certificates might be obtained.

The SSDI is also an index to individuals’ applications for social security numbers, form SS-5. The SS-5 is rich in genealogical information. It contains birth names, birth information, parents’ birth names, and mailing address. According to the current fee table, it is $2 cheaper ($27 vs. $29) to obtain a deceased person’s SS-5 if you have the SSN.

Identity Theft

For many years the SSDI was unfairly accused of being a source of information for identity thieves. Quite the opposite, it has long been a mainstay in preventing identity theft. Attempted identity theft of a deceased individual is quickly detected by checking the SSDI (or DMF).

However, thieves recently found a way to utilize the SSDI to commit identity theft. Identity thieves discovered that the Internal Revenue Service doesn’t check the SSDI and started filing fraudulent tax refund claims using the SSNs of recently deceased individuals.

Legislation

Four legislative proposals are currently under consideration to prevent this abuse from continuing. Three would delay release of SSNs of recently deceased individuals. One would totally eliminate the SSDI/DMF.

HR 3475: http://www.gpo.gov/fdsys/pkg/BILLS-112hr3475ih/pdf/BILLS-112hr3475ih.pdf
SB 1534: http://www.gpo.gov/fdsys/pkg/BILLS-112s1534is/pdf/BILLS-112s1534is.pdf
HR 3482: http://www.gpo.gov/fdsys/pkg/BILLS-112hr3482ih/pdf/BILLS-112hr3482ih.pdf
HR 3215: http://www.gpo.gov/fdsys/pkg/BILLS-112hr3215ih/pdf/BILLS-112hr3215ih.pdf

The Records Preservation and Access Committee (RPAC) issued the following statement:

RPAC and its sponsoring organizations (FGS, NGS, IAJGS) all submitted Statements for the Record supplementing the transcript of the March 20, 2012 hearing of the Subcommittee on Fiscal Responsibility and Economic Growth of the Senate Committee on Finance entitled “Tax Fraud by Identity Theft, Part 2: Status, Progress, and Potential Solutions” in response to a committee invitation to do so.

RPAC Statement for Record SFC

FGS Statement for Record SFC

NGS Statement for Record SFC

IAJGS Statement for Record SFC

Follow their blog at http://www.fgs.org/rpac.

Tuesday, April 10, 2012

Browsing an Image-Only Collection

I came across an image so hard to read on Ancestry.com, I doubted the original microfilm could possibly be as bad. It was the World War I draft registration card for Elseberry Allen of San Antonio, Texas:

World War I draft registration card for Elseberry Allen of San Antonio, Texas on Ancestry.com

Since FamilySearch is working to get its microfilm collection posted online, I checked to see if they had World War I draft registration cards posted. If they had, I wouldn’t have to resort to microfilm.

They had, but it was not indexed.

No need to fear. Browsing to a record is challenging, but can be done simply.

In this case, I had the advantage of having already found the record using the index on Ancestry.com. The name was Elseberry Allen. More importantly, above the image Ancestry.com gave the browse path: Texas > San Antonio City > 3 > Draft Card A.

The browse path of an image on Ancestry.com

Here’s how I found the image on FamilySearch.org.

1. I started at http://FamilySearch.org. (You’re welcome to follow along. You’ll learn better that way.)

2. Down next to the world map I clicked on All Record Collections. You do this whenever you wish to browse a collection without an index.

3. In the collection name Search box I started to type words from the collection title. “Draft” is all it took to narrow the list sufficiently.

4. I clicked United States, World War One Draft Registration Cards, 1917-1918.

5. I clicked Browse through 25,007,403 images. No, I didn’t plan to browse through all 25 million. Mathematically, on average, I only have to browse half of them.

Just kidding. Ancestry.com and FamilySearch divide up a record collection’s images into small sets. I didn’t expect to look at more than several dozen images, max.

6. I saw from the Ancestry browse path that I needed to click on Texas and then on San Antonio City no 3; A-O. This brought me to a set of 4,590 images (which roughly corresponded to a roll of microfilm).

Now a microfilm reader has a crank. A computer does not. Advancing a roll of microfilm by one crank, or a couple, or even several dozen is very intuitive. No one has yet brought the same intuitive function to browsing online images. So here’s whatcha do:

7. I hit the right arrow twice until I reached the first record, Albert Aaron.

The arrows are hidden in light gray and positioned so you’ll never find them. Think of it like a treasure hunt between you and FamilySearch’s designers. Hint: Currently they are hidden along the right edge of the window, just above Save and Print. Don’t worry, FamilySearch will move them to keep you sharp. (“Kato: I mean it. Do not attack me now…”)

8. I took a wild guess that the image number for Elseberry Allen was 1,000 (out of the 4,590 images). I entered 1000 (not 1,000) in the image box and then played treasure hunt again with FamilySearch. Don’t bother trying to find the button to click; there isn’t one. Press Enter on the keyboard. That took me to the record of Tom J. Cecil.

Image number of an image on FamilySearch.org

Too far, but at least I knew that Allen was between Aaron and Cecil. I wrote this down, leaving plenty of room between the two:

3 – Aaron






1,000 – Cecil

9. I took another guess, not quite as wild. I went to image 200, which turned out to be Atkinson. I knew Allen was between Aaron and Atkinson, so I replaced Cecil with Atkinson, like this:

3 – Aaron





200 – Atkinson
1,000 – Cecil

10. Next I guessed 100, and found Altmeir:

3 – Aaron




100 – Altmeir
200 – Atkinson
1,000 – Cecil

11. I checked 50 and found Aguilar. Allen was after Aguilar, so I wrote this down:

3 – Aaron
50 – Aguilar



100 – Altmeir
200 – Atkinson
1,000 – Cecil

12. I guessed 80. It was Alfred George Allen, a little before Elseberry Allen.

3 – Aaron
50 – Aguilar
80 – Allen, Alfred George


100 – Altmeir
200 – Atkinson
1,000 – Cecil

13. That left fewer than twenty images to look at. Twenty images isn’t bad at all. I used the right arrow several times and found Elseberry Allen at image 85.

World War I draft registration card for Elseberry Allen of San Antonio, Texas on FamilySearch.org

I was right. The original microfilm was much better than Ancestry’s.

And to find it I didn’t have to look at all 25 million images in the collection.

I didn’t have to look at all 4,590 images in the image set.

I looked at eleven.
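
For the programmers in the audience, the guessing game above is essentially a bisection search followed by a short walk with the right arrow. Here is a minimal Python sketch of the idea, not anything FamilySearch provides. The `name_at` function and the made-up list of names are hypothetical stand-ins for opening an image and reading the name on it, and the sketch assumes the images in a set are in strict alphabetical order.

```python
def find_image(name_at, target, first, last, scan_below=20):
    """Bracket `target` between two image numbers, then step through the gap.

    name_at(n) is a hypothetical stand-in for opening image n and reading
    the name on it. Images first..last are assumed to be sorted by name.
    Returns (image_number or None, number_of_images_viewed).
    """
    viewed = 0
    lo, hi = first, last              # target, if present, lies in lo..hi
    while hi - lo > scan_below:
        mid = (lo + hi) // 2          # jump to an image by number
        viewed += 1
        name = name_at(mid)
        if name == target:
            return mid, viewed
        if name < target:
            lo = mid + 1              # target sorts after this image
        else:
            hi = mid - 1              # target sorts before this image
    for n in range(lo, hi + 1):       # like pressing the right arrow
        viewed += 1
        if name_at(n) == target:
            return n, viewed
    return None, viewed


# Made-up data: 4,590 sortable names, with the target placed on image 85.
names = [f"Name{n:04d}" for n in range(1, 4591)]
image, viewed = find_image(lambda n: names[n - 1], names[84], 1, len(names))
print(f"Found on image {image} after viewing {viewed} images")
```

The guesses in the walkthrough (1000, 200, 100, 50, 80) were not exact midpoints; strict halving is just the simplest rule to write down. Either way, the number of images viewed grows very slowly compared with the size of the set.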

Monday, April 9, 2012

Monday Mailbox: Insider Unfair

Insider Unfair to Archives.com

Dear Insider,

The NARA census website is working well now (Tuesday night, Pacific time), and my Twitter and Facebook friends have been reporting greatly improved site response starting mid-day, so it seems like they got things under control.

They were hit with probably 4x their estimated traffic yesterday and then those massive numbers *increased* today, Tuesday. There's only so fast anyone can reasonably scale up with those kinds of numbers of visitors, and with such "heavy" (big high-res images) data.

So yeah, maybe you could cool it a bit with the "ZOMG EPIC FAIL!" stuff. We all know that Ancestry.com really wanted to win the 1940 Census contract over Archives.com, and yet they didn't, so you certainly wouldn't want your readers accusing you guys of "sour grapes" now, right?

Signed,
Asparagirl *

Dear Asparagirl,

I am not part of Ancestry.com nor do I represent them. My opinions are my own.

When you say that "we all know that Ancestry.com really wanted..." I must confess I did not know. I’m surprised to learn they had bid on the contract, so your information is news to me. Might I inquire what your source is for that news?

Signed,
--The Insider

Insider Unfair to Ancestry.com

Dear Insider,

Seems disingenuous for you to call Ancestry 3rd, when they have the highest quality images up, as well as have always been reachable, when Archives.com (the NARA site) wasn't even usable that first day, and barely improved the 2nd day. Personally, I would rather get something (even if they aren't all there) than sit waiting endlessly for empty images.

Additionally, you haven't mentioned that Ancestry has the only currently searchable indexes up (except the 500 or so records from somebody else)

Do you have an axe to grind from your time at Ancestry?

The least you could do for you readers is provide a balanced opinion. IMO the best experience over the whole 1940 excitement is and has been Ancestry.com

Signed,
The Rowdy *

Dear Rowdy,

NARA did the image scanning so Ancestry.com’s images can’t be better than everybody else’s.

Your point on usability is well made, but I assigned places based on the order in which the horses crossed the finish line, not for how pretty they looked when they did so.

Perhaps Archives.com should have been disqualified since they started at the finish line while everyone else started at the starting line. Hit head on by a water cannon, they stumbled backwards while the rest of the field closed down on them. That they fought their way back and crossed the finish line more than a day ahead of Ancestry.com earned them a second place finish, despite the deluge that continued unabated.

Don't give up on your favorite horse just yet. The race for indexes has just begun and as you point out, Ancestry.com has established an early lead.

Signed,
--The Insider

Insider Unfair to FamilySearch

No one wrote in to say I was unfair to FamilySearch! Maybe I’m being too easy on my employer.

They remain a long way from the finish line. They are by far the slowest horse. The best they can do is take fourth. I’ve seen better looking horses in a glue factory. I’ve seen faster horses on a carousel.

How was that? Any takers?

Friday, April 6, 2012

Serendipity in 1940

It is as though our ancestors want to be found. Uncanny coincidence. Olympian luck. Phenomenal fate. Tremendous intuition. Remarkable miracle. We call it, “Serendipity in Genealogy.”

With the release of the 1940 census this week, indexers have all had the same dream: download a batch and win the lottery. You all know which lottery I’m talking about: pulling up that census page, starting to type in names, and finding your own family.

The probability of getting any one particular image, your father’s family for instance, is 1 in 3.8 million, according to numbers released by the National Archives.

That’s exactly what happened to one of my coworkers, Mike Hall.

“The very first batch I pulled up was from Kansas,” Mike said. “I downloaded the batch and found it was from Geary County in the northeast corner of the state. I looked and there was my father, my grandmother, and my aunt.”

The Hall family in the 1940 Census in Junction City, Kansas

Sure, picking the specific state helped his odds tremendously. But he wasn’t finished. After indexing that batch, he downloaded another.

“My second batch was from Cowley county in the southeast corner of the state. Everyone on the page was a relative: great aunts and so forth.”

Mike Hall relatives in the 1940 census in Cowley, Kansas

The chance of getting the first Kansas batch was one in 50,760. The probability of getting those two Kansas batches was one in 2.6 billion.
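
The second figure is simply the first one squared. Here is a quick check in Python. It assumes each download is an independent draw with the same 1-in-50,760 chance, which is a guess at how the number was reached rather than something spelled out above.

```python
# Odds quoted above for landing one particular Kansas batch.
kansas_batches = 50_760

# Assumption: two independent draws, each equally likely to return any batch.
p_first = 1 / kansas_batches
p_both = p_first * p_first

one_in = kansas_batches ** 2
print(f"Both batches: 1 in {one_in:,} (about {one_in / 1e9:.1f} billion)")
```

In real life the second draw is not quite independent (the first batch is no longer in the pool), but with tens of thousands of batches the difference is negligible.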

“Do you feel lucky,” my friend said, quoting Clint Eastwood’s Dirty Harry character. “I do.”

His final comment was an invitation to everyone to help index. (See Indexing.FamilySearch.org.) “Who are you going to find?”

#1940Census Status Update for 5 April 2012

Ancestry.com didn’t seal up 3rd place Thursday. They increased from 66% to 86%. They will almost certainly finish today (Friday).

FamilySearch nearly doubled the amount of data online, advancing from 14% to 26%.

FindMyPast.com and RootsPoint.com remain at zero.

Archives.com put together a graphic showing how spectacularly large the traffic was to Archives.gov during the first two days of the census launch. See the graphic here.

FamilySearch map of indexing project completion

As the census image race winds down, the indexing race is heating up. Accordingly, FamilySearch has posted a heat map of the United States. The darkness of each state indicates how complete its indexing project is. Hover the mouse over a state and a popup shows that state’s progress.

Currently Delaware is 99%, Colorado is 72%, Kansas is 30%, Oregon is 25%, and the remaining indexing projects are 1 to 4%.

Let the next race begin!

Thursday, April 5, 2012

#1940Census Status Update for 4 April 2012

Census enumerator speaks to man on tractor

Today MyHeritage crossed the image finish line. Early in the day they finished loading all states. The online buzz about MyHeritage has been very positive. I declare them first place.

Archives.gov/Archives.com seems to be working now. They haven’t restored all the functionality they turned off, such as image pan and zoom, but they are successfully serving users. What’s more, they are handling more traffic than they measured on Monday. (This underlines the gargantuan size of Monday’s traffic. Not only was there more traffic than what was serviced, there was probably more traffic than what could be measured.) They have accomplished a megafeat and I’m giving them second place.

Ancestry.com hit 66% complete, leaving one third of the data still to load. They’ve announced a number of states are in process or coming soon (see the notes column in the table below). They will probably cross the finish line and take 3rd place Thursday or Friday.

FamilySearch advanced from 10% to 14% on Wednesday. Interestingly, as of 6pm Wednesday, they have a number of states that are available for indexing, but are not available for image access: California, Minnesota, New Hampshire, and Pennsylvania. I guess that shows FamilySearch’s priority: indexing over image availability. (Have I hit you up lately to help out? Indexing.familysearch.org)

FamilySearch also announced the order in which they will probably post indexing projects. (See the Indexing column in the table.) I would expect the images to be published for browsing in approximately the same order.

FindMyPast.com remains at 0%.

RootsPoint.com is a new entry in the race. I learned of them from an article by Kimberly Powell of About.com. They have not yet posted any images. They are owned by IIMI (Intelligent Image Management Inc.). IIMI is an offshore indexing company. When organizations don’t have volunteer indexers, they go to companies like IIMI to have the indexing done “off shore.” This particular one is located in India and claims it has been hired by previous companies to index all previous U.S. censuses and U.K. censuses from 1841 to 1871. They are currently working on the 1901 census. IIMI is indexing the 1940 Census on their own initiative with the plan to sell (license) it to interested parties. RootsPoint.com will also make the data available to the general public. Basic fields will be available at no cost; additional fields will be available by subscription. For more information, read their complete press release.

 

| State | Data (GB) | FamilySearch | FamilySearch Indexing | Ancestry.com | Ancestry Notes | MyHeritage |
| --- | --- | --- | --- | --- | --- | --- |
| Alabama | 316.0 | 3-Apr | Indexing | 4-Apr | | 3-Apr |
| Alaska | 12.6 | | 50 | | | 3-Apr |
| American Samoa | 0.8 | | 54 | 2-Apr | | 3-Apr |
| Arizona | 58.6 | | 36 | | Coming | 3-Apr |
| Arkansas | 194.6 | | 30 | | | 3-Apr |
| California | 818.3 | | Indexing | 3-Apr | | 3-Apr |
| Colorado | 136.8 | 2-Apr | Indexing | 4-Apr | | 4-Apr |
| Connecticut | 175.8 | | 37 | | In process | 4-Apr |
| Delaware | 30.4 | 2-Apr | Done | 2-Apr | | 4-Apr |
| District of Columbia | 66.8 | | 38 | 2-Apr | | 4-Apr |
| Florida | 197.0 | 2-Apr | Indexing | | In process | 4-Apr |
| Georgia | 332.0 | | 24 | 4-Apr | | 4-Apr |
| Guam | 1.4 | | 52 | 2-Apr | | 3-Apr |
| Hawaii | 51.3 | | 51 | | Coming | 3-Apr |
| Idaho | 62.5 | | 25 | | | 4-Apr |
| Illinois | 864.0 | | 17 | | In process | 3-Apr |
| Indiana | 353.0 | | 18 | 2-Apr | | 3-Apr |
| Iowa | 279.0 | | 39 | | | 4-Apr |
| Kansas | 197.0 | 2-Apr | Indexing | 4-Apr | | 4-Apr |
| Kentucky | 298.0 | | 29 | 4-Apr | | 3-Apr |
| Louisiana | 259.3 | 4-Apr | Indexing | | | 4-Apr |
| Maine | 105.0 | | 40 | 2-Apr | | 3-Apr |
| Maryland | 199.5 | | 26 | | | 4-Apr |
| Massachusetts | 484.3 | | 31 | 4-Apr | | 3-Apr |
| Michigan | 584.0 | | 32 | | Coming | 3-Apr |
| Minnesota | 320.5 | | Indexing | | | 4-Apr |
| Mississippi | 250.2 | 4-Apr | Indexing | | | 3-Apr |
| Missouri | 431.0 | | 20 | 4-Apr | | 3-Apr |
| Montana | 73.0 | | 41 | | Coming | 4-Apr |
| Nebraska | 144.0 | | 27 | 4-Apr | | 4-Apr |
| Nevada | 14.1 | | 33 | 2-Apr | | 3-Apr |
| New Hampshire | 48.3 | | Indexing | | Coming | 4-Apr |
| New Jersey | 435.0 | | 42 | | In process | 3-Apr |
| New Mexico | 62.4 | | 43 | | In process | 4-Apr |
| New York | 1,494.5 | | 16 | 3-Apr | | 2-Apr |
| North Carolina | 403.0 | | 21 | 3-Apr | | 3-Apr |
| North Dakota | 77.4 | | 44 | | | 4-Apr |
| Ohio | 826.4 | | 22 | 3-Apr | | 3-Apr |
| Oklahoma | 270.0 | 3-Apr | Indexing | 4-Apr | | 4-Apr |
| Oregon | 129.5 | 2-Apr | Indexing | 4-Apr | | 3-Apr |
| Panama Canal Zone | 6.3 | | 53 | 2-Apr | | 3-Apr |
| Pennsylvania | 1,086.2 | | Indexing | 3-Apr | | 2-Apr |
| Puerto Rico | 186.0 | | 56 | | | 3-Apr |
| Rhode Island | 127.8 | | 45 | 2-Apr | | 2-Apr |
| South Carolina | 230.1 | | 34 | | | 4-Apr |
| South Dakota | 83.1 | | 46 | | In process | 4-Apr |
| Tennessee | 341.5 | | 28 | 3-Apr | | 3-Apr |
| Texas | 775.2 | | 15 | 3-Apr | | 3-Apr |
| Utah | 68.0 | | 19 | | | 4-Apr |
| Vermont | 38.7 | | 47 | 4-Apr | | 4-Apr |
| Virgin Islands | 2.0 | | 55 | 2-Apr | | 3-Apr |
| Virginia | 286.8 | 2-Apr | Indexing | 3-Apr | | 3-Apr |
| Washington | 258.0 | | 23 | 3-Apr | | 4-Apr |
| West Virginia | 210.6 | | 48 | 4-Apr | | 4-Apr |
| Wisconsin | 359.3 | | 35 | | Coming | 4-Apr |
| Wyoming | 33.3 | | 49 | 4-Apr | | 3-Apr |
| Total | 15,150.2 | 2,073.0 | | 10,018.7 | | 15,150.2 |
| | | 14% | | 66% | | 100% |
| Data as of | | 6:00pm MDT, 3 Apr. | | 4:30pm MDT, 4 Apr. | | 8:00am MDT, 4 Apr. |
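
If you want to check a column total yourself, here is a minimal Python sketch for the FamilySearch column. The gigabyte figures are copied from the rows above that show a FamilySearch load date; treating the percentage as loaded gigabytes divided by the grand total is an assumption about how the table was built.

```python
# Gigabytes for each state with a FamilySearch load date in the table above.
familysearch_loaded_gb = {
    "Alabama": 316.0,
    "Colorado": 136.8,
    "Delaware": 30.4,
    "Florida": 197.0,
    "Kansas": 197.0,
    "Louisiana": 259.3,
    "Mississippi": 250.2,
    "Oklahoma": 270.0,
    "Oregon": 129.5,
    "Virginia": 286.8,
}
total_census_gb = 15_150.2  # grand total from the Data (GB) column

# Round to one decimal to hide floating-point noise in the sum.
loaded_gb = round(sum(familysearch_loaded_gb.values()), 1)
percent = round(100 * loaded_gb / total_census_gb)
print(f"FamilySearch: {loaded_gb:,.1f} GB loaded, about {percent}%")
```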

FamilySearch Indexing Numbers

In their latest Indexing Update, FamilySearch revealed some interesting numbers.

Since 2006, FamilySearch has dramatically improved its ability to scan and convert microfilmed records into digital images.  In addition, more than 185 digital camera crews are now at work throughout the world, capturing images six days a week. Between microfilm digital conversions and new field captures, FamilySearch now creates an average of 10 new digital images every second of every day and publishes them at familysearch.org within a matter of weeks.
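
To get a feel for what 10 images per second adds up to, here is a bit of back-of-the-envelope arithmetic in Python. It takes the quoted average at face value and assumes it holds around the clock for a 365-day year.

```python
images_per_second = 10            # average quoted by FamilySearch
seconds_per_day = 24 * 60 * 60    # 86,400

images_per_day = images_per_second * seconds_per_day
images_per_year = images_per_day * 365

print(f"{images_per_day:,} images per day")
print(f"{images_per_year:,} images per year")
```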

Wednesday, April 4, 2012

#1940Census Status Update for 3 April 2012

Census enumerator visiting a trailer park

Today the MyHeritage horse appeared out of nowhere and moved into first place with 58% of the census published. If that weren’t enough, they were audacious enough to publish the first of their name index: Bristol County, Rhode Island. These guys mean business.

Ancestry.com is running a close second, with 46% of the census published. Percentages refer to gigabytes of images for states completely loaded. Unlike other horses, Ancestry.com allows access to states during loading, so their percentage is higher than indicated.

FamilySearch fell to a distant third with just 10% online.

Tuesday evening I saw my first images on Archives.gov/Archives.com. Three of the first dozen crashed my browser. Sure; I’m running Firefox with 11 tabs open, but it’s never a good sign when a particular website triggers multiple browser crashes. Still, one has to hope that they have turned the corner. The crashes stopped once I closed half the tabs.

Which sites have loaded which states? Archives.gov (Archives.com) has them all, if you can get it to work for you. Others have:

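| State | Data (GB) | FamilySearch | Ancestry.com | MyHeritage |
| --- | --- | --- | --- | --- |
| Alabama | 316.0 | 3-Apr | | 3-Apr |
| Alaska | 12.6 | | | 3-Apr |
| American Samoa | 0.8 | | 2-Apr | 3-Apr |
| Arizona | 58.6 | | | 3-Apr |
| Arkansas | 194.6 | | | 3-Apr |
| California | 818.3 | | 3-Apr | 3-Apr |
| Colorado | 136.8 | 2-Apr | | |
| Connecticut | 175.8 | | | |
| Delaware | 30.4 | 2-Apr | 2-Apr | |
| District of Columbia | 66.8 | | 2-Apr | |
| Florida | 197.0 | 2-Apr | | |
| Georgia | 332.0 | | | |
| Guam | 1.4 | | 2-Apr | 3-Apr |
| Hawaii | 51.3 | | | 3-Apr |
| Idaho | 62.5 | | | |
| Illinois | 864.0 | | | 3-Apr |
| Indiana | 353.0 | | 2-Apr | 3-Apr |
| Iowa | 279.0 | | | |
| Kansas | 197.0 | 2-Apr | | |
| Kentucky | 298.0 | | | |
| Louisiana | 259.3 | | | |
| Maine | 105.0 | | 2-Apr | 3-Apr |
| Maryland | 199.5 | | | |
| Massachusetts | 484.3 | | | 3-Apr |
| Michigan | 584.0 | | | 3-Apr |
| Minnesota | 320.5 | | | |
| Mississippi | 250.2 | | | 3-Apr |
| Missouri | 431.0 | | | 3-Apr |
| Montana | 73.0 | | | |
| Nebraska | 144.0 | | | |
| Nevada | 14.1 | | 2-Apr | 3-Apr |
| New Hampshire | 48.3 | | | |
| New Jersey | 435.0 | | | 3-Apr |
| New Mexico | 62.4 | | | |
| New York | 1,494.5 | | 3-Apr | 2-Apr |
| North Carolina | 403.0 | | 3-Apr | |
| North Dakota | 77.4 | | | |
| Ohio | 826.4 | | 3-Apr | |
| Oklahoma | 270.0 | 3-Apr | | |
| Oregon | 129.5 | 2-Apr | | |
| Panama Canal Zone | 6.3 | | 2-Apr | |
| Pennsylvania | 1,086.2 | | 3-Apr | 2-Apr |
| Puerto Rico | 186.0 | | | |
| Rhode Island | 127.8 | | 2-Apr | 2-Apr |
| South Carolina | 230.1 | | | |
| South Dakota | 83.1 | | | |
| Tennessee | 341.5 | | 3-Apr | |
| Texas | 775.2 | | 3-Apr | 3-Apr |
| Utah | 68.0 | | | |
| Vermont | 38.7 | | | |
| Virgin Islands | 2.0 | | 2-Apr | 3-Apr |
| Virginia | 286.8 | 2-Apr | 3-Apr | 3-Apr |
| Washington | 258.0 | | 3-Apr | |
| West Virginia | 210.6 | | | |
| Wisconsin | 359.3 | | | |
| Wyoming | 33.3 | | | 3-Apr |
| Total (Gigabytes) | 15,150.2 GB | 1,563.5 GB | 6,997.5 GB | 8,780.0 GB |
| | | 10% | 46% | 58% |
| Data as of | | 6:30pm MDT, 3 Apr. | 4:30pm MDT, 3 Apr. | 5:30pm MDT, 3 Apr. |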
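
The percentages in this update are gigabytes of fully loaded states divided by the gigabytes for the whole census. A minimal Python sketch of that division, using only the totals from the table above (the variable names are mine):

```python
total_census_gb = 15_150.2

# Gigabytes in fully loaded states, from the Total row of the table.
loaded_gb = {
    "FamilySearch": 1_563.5,
    "Ancestry.com": 6_997.5,
    "MyHeritage": 8_780.0,
}

# Share of the census each site has fully loaded, rounded to whole percent.
percent_loaded = {
    site: round(100 * gb / total_census_gb) for site, gb in loaded_gb.items()
}

for site, pct in percent_loaded.items():
    print(f"{site}: {pct}%")
```

Because states still in the middle of loading are left out, a site that opens states to the public while they load will look a little further behind than it really is.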

Tuesday, April 3, 2012

Details Behind the Failed 1940 Census Launch

1940census.archives.gov was down most of the official launch day

During the live launch event of the 1940 Census, Census Director Robert Groves was set to search for a member of his family. But the image never loaded, according to a story in the Chicago Tribune. Upon launch the website 1940census.archives.gov was immediately overwhelmed. In the first three hours, the website had 22.5 million hits.

The National Archives contracted with Inflection LLC, parent of Archives.com, to host the website.

“We want to apologize to the millions of people who came to the 1940 census website this morning in search of information about their family history,” the company said in a statement. “We take full responsibility for the technical issues that have occurred and are very sorry for the inconvenience you may have experienced.”

The contract between the government and Archives.com parent, Inflection, specified that they had to:

4.4.1 Support up to 10 million hits per day, while providing response times of less than three seconds for keyword searches of the descriptive metadata. A hit is defined as a request for a file from the web server.

4.4.2 Support up to 25,000 concurrent users.

4.4.3 Scale on demand in the event that 10 million hits and/or 25,000 concurrent users are exceeded.

For “scale on demand,” Archives.com utilizes services from online bookstore vendor, Amazon.com. In addition to its online store, Amazon also provides “in the cloud” services to some websites. Theoretically, Amazon cloud services can be easily scaled to meet unexpected needs. Apparently, scalability is not completely transparent.
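
To put the numbers in perspective, here is a back-of-the-envelope comparison in Python between the contracted 10 million hits per day and the 22.5 million hits reported for the first three hours. It assumes traffic is spread evenly over each period, which real traffic never is, so treat the rates as rough averages.

```python
def hits_per_second(hits, hours):
    """Average request rate, assuming traffic is spread evenly over the period."""
    return hits / (hours * 3600)


contract_rate = hits_per_second(10_000_000, 24)  # clause 4.4.1: 10 million per day
launch_rate = hits_per_second(22_500_000, 3)     # reported for the first three hours
ratio = launch_rate / contract_rate

print(f"Contract baseline: {contract_rate:,.0f} hits per second")
print(f"Launch morning:    {launch_rate:,.0f} hits per second")
print(f"Launch traffic was about {ratio:.0f} times the contracted daily average")
```

By this rough measure the launch morning ran at about 18 times the average rate the contract planned for, which is a lot to ask of any “scale on demand” arrangement.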

“We'd like to thank Amazon.com, who has been helping us with some of the scalability challenges we're tackling and lending important technical expertise,” said Archives.com.

Archives.com engineers worked through the night to fix issues, according to the company. Overnight engineers disabled some functionality, hoping to relieve the load on the overburdened servers. As I write this mid-day Tuesday, the website is still not functional. The site displays text without graphics or formatting. Click the image at the top of the page to see what the page should look like.

The company continues to work today to solve issues.

“Genealogists often claim that theirs is the biggest hobby in America,” said NPR host Robert Siegel. “It's very hard to find hard data to support that, but this would come pretty close if there are that many millions of people who are trying to get in.”

#1940Census Last Status Update for 2 April

1940 Census History Is STILL Waiting

As I write this it is nearing the close of 2 April in the Eastern United States and it’s time for a final status update on the launch of the 1940 U.S. Census. If you receive my articles via e-mail, you’ll also find updates from throughout Monday.

  • Ancestry.com – Ancestry finished the day with images loaded for four states and these territories: American Samoa, Delaware, District of Columbia, Guam, Indiana, Maine, Nevada, Panama Canal, Virgin Islands. Some images are available from several more states as image loading progresses: California, New York, Pennsylvania, Rhode Island, and Virginia.
  • FamilySearch.org – It appears FamilySearch’s plan for the day was to load just the five pre-announced states: Colorado, Delaware, Kansas, Oregon, Virginia. Once those states were loaded, their focus seemed to turn to indexing.
  • Indexing.FamilySearch.org – FamilySearch finished the day with indexing projects available for all five states except Delaware, which was previously available. Does that mean we finished indexing Delaware in one-half day?!? I’ll check at work Tuesday. There are arbitration batches available for all five states.

    I earned bragging rights by indexing a batch on the first day. I did page 1B of Philomath, Benton, Oregon. (Somebody beat me to page 1A.)
  • 1940census.archives.gov – As I expected, an unexpectedly large number of visitors crashed this, the official website of the National Archives. It was unavailable until the end of the day. In case it is down again Tuesday, click the image at right to see how it is supposed to appear.

    The website was designed for 10 million hits per day, with the ability to scale even larger on demand. According to a spokesperson for hosting company, Archives.com, the site got 37 million hits during the first eight hours. Archives.com issued this statement about the outage.
  • FindMyPast.com – A spokesperson for Find My Past said they will start posting images in a couple of days and have all of them up in the next two weeks.
  • Archives.com is hosting the National Archives census website, so I wasn’t completely surprised that they didn’t have anything on their own website (that I could find). Still, they could have commercialized their own site, which they couldn’t do on the government site.
  • MyHeritage.com – After the close of the business day, the MyHeritage website began allowing searches of the census data. See http://www.myheritage.com/1940census. I saw no indication as to which states were online. I tried a few: Utah, no; Delaware, no; District of Columbia, no; Vermont, no; Ohio, no. I feel somewhat confident in saying they probably had no images loaded Monday.

I’m awarding first place of Day One of the horse race to Ancestry.com for getting all or parts of 15 states/territories online. Second place goes to FamilySearch, who got five states loaded and sent off to us volunteers (we volunteers?) for indexing. The National Archives (Archives.gov) and host Archives.com did not place, giving what many are calling a dismal performance.

Happy Anniversary to Me

Today marks the 5th Anniversary of the Ancestry Insider. Thanks for all your support.

Enough celebrating. Now get back to work…

Monday, April 2, 2012

#1940Census 6:20pm EDT Status Report

1940 Census History Is STILL Waiting

FamilySearch Indexing has released the first indexing project early. Projects for the first states are available now!

NARA/Archives.com pushed live a new copy of the website software around 4pm EDT hoping to solve problems, but problems persist. The requirement from NARA specified that Archives.com had to support a load of 10 million hits per day, with the ability to handle even larger loads on demand. According to a spokesperson for hosting company, Archives.com, NARA got 37 million hits between 9am and 5:17pm EDT. Archives.com issued this statement.

  • Ancestry.com – Images available for four states (and these territories): American Samoa, Delaware, District of Columbia, Guam, Indiana, Maine, Nevada, Panama Canal, Virgin Islands.

    Image load is in progress for California, New York, Pennsylvania, Rhode Island, Virginia.
  • FamilySearch.org – FamilySearch has finished loading the five states they pre-announced: Colorado, Delaware, Kansas, Oregon, Virginia.
  • Indexing.FamilySearch.org – FamilySearch has finished loading these states: Delaware, Oregon, Virginia.
  • 1940census.archives.gov – Basically “crashed” because of large demand. Displays text only. Couldn’t get images to display. They tried revving the software around 4pm EDT, but no luck.
  • FindMyPast.com – A spokesperson for Find My Past says they will start posting images in a couple of days and have all of them up in the next two weeks.
  • Archives.com – Couldn’t find any 1940 data on their site, but does have a link to the NARA site, which they host.
  • MyHeritage.com – Couldn’t find any 1940 data.

History is Still Waiting - #1940Census

1940 Census History Is STILL Waiting

1940 Census 4:00pm EDT Status Report.

I’ve updated my 1940 Census button shown to the right. FamilySearch leads in states loaded by one. Ancestry.com holds the lead for total states and territories. NARA/Archives.com is still mostly crashed. FindMyPast doesn’t intend to post images today. MyHeritage does intend to do so.

  • Ancestry.com – Commenters have pointed out that just because Ancestry.com includes a state in the interface dropdown, it doesn’t mean that all the images are present. Images available for four states (and these territories): American Samoa, Delaware, District of Columbia, Guam, Indiana, Maine, Nevada, Panama Canal, Virgin Islands. Image load is in progress for California, New York, Pennsylvania, Rhode Island, Virginia.
  • FamilySearch.org – FamilySearch has finished loading the five states they pre-announced: Colorado, Delaware, Kansas, Oregon, Virginia.
  • Indexing.FamilySearch.org – No indexing projects. None are projected until 10pm MDT this evening.
  • 1940census.archives.gov – Basically “crashed” because of large demand. Displays text only. Couldn’t get images to display.
  • FindMyPast.com – A spokesperson for Find My Past says they will start posting images in a couple of days and have all of them up in the next two weeks.
  • Archives.com – Couldn’t find any 1940 data on their site, but does have a link to the NARA site, which they host.
  • MyHeritage.com – Couldn’t find any 1940 data.

#1940Census 2:00pm EDT Status Report

1940 Census 2:00pm EDT Status Report.

Ancestry.com leads the horse race followed by FamilySearch.org. NARA/Archives.com is currently a distant third. If they were to fix their server problems, they would be #1. MyHeritage and FindMyPast are no-shows so far.

  • FamilySearch.org – ED information searchable for Delaware. Apparently all Delaware images are available, but some errors occur when viewing.
  • Indexing.FamilySearch.org – No indexing projects. None projected until this evening.
  • 1940census.archives.gov – Basically “crashed” because of large demand. Displays text only. Couldn’t get images to display.
  • Ancestry.com – Images available for 14 states (and territories): American Samoa, California, Delaware, District of Columbia, Guam, Indiana, Maine, Nevada, New York, Panama Canal, Pennsylvania, Rhode Island, Virgin Islands, Virginia.
  • MyHeritage.com – Couldn’t find any 1940 data.
  • FindMyPast.com – Couldn’t find any 1940 data.
  • Archives.com – Couldn’t find any 1940 data on their site, but does have a link to the NARA site which they host.

1940 Census 1:00pm EDT Status Report

Ancestry.com leads the horse race, followed by FamilySearch.org. NARA/Archives.com is currently a distant third. If they were to fix their server problems, they would be #1. MyHeritage and brightsolid are no-shows.

  • FamilySearch.org – ED information searchable for Delaware. Apparently all Delaware images are available, but some errors occur when viewing.
  • Indexing.FamilySearch.org – No indexing projects.
  • 1940census.archives.gov – Displays text only. Couldn’t get images to display.
  • Ancestry.com – Images available for 13 states (and territories): American Samoa, California, Delaware, District of Columbia, Guam, Indiana, Maine, Nevada, New York, Panama Canal, Pennsylvania, Rhode Island, Virgin Islands.
  • MyHeritage.com – Couldn’t find any 1940 data.
  • FindMyPast.com – Couldn’t find any 1940 data.
  • Archives.com – Couldn’t find any 1940 data on their site, but does have a link to the NARA site which they host.

1940 Census 10:00 EDT Status Report

8:00 MDT Status report