Saturday, March 7, 2009

Thanks to a Wonderful Volunteer

For the past few months I've had some invaluable help from a new friend online. I want to publicly thank Jeffery Mance, of Canada, for his consistent and unflagging attention to detail in reporting broken links to me for Cyndi's List. He has been methodically going through the pages for the United Kingdom, sending me the broken link reports and supplying updated URLs for the links whenever possible. He has also pointed out errors in link placement so that I could fix those as well. I'm working my way through all of these reports, and I am extremely thankful that Jeffery has already done half of the work for me. Thank you, Jeffery!

If you want to report broken links you find on Cyndi's List, please see:

To Use the Wayback Machine or Not Use the Wayback Machine...

I've got a dilemma, something I can't seem to make a decision on: whether or not I should link to archived copies of web pages held in the Wayback Machine's Internet archives. I'll elaborate on my dilemma in a moment.

First, visit the Wayback Machine (part of the Internet Archive) to see what I'm referring to. The Internet Archive was founded in 1996 with the mission to archive all web pages on the Internet. There are many pros and cons to the idea of archiving web pages. You can find conversations on this topic throughout the message boards on the Internet Archive web site. Briefly--once a web page has disappeared from the Internet it is nice to have an old, archived copy to view in order to find information you might need. As a genealogical researcher you might find important data that had been published in the past, but is no longer available on a current web page. Handy. However, on the con side of this issue we run into privacy and intellectual property issues. Does a third-party web site have the right to archive your written works and republish them on its own web site without your permission?

You won't find Cyndi's List on the Wayback Machine. Early on I ran into a problem with the archives. People started reporting problems to me regarding broken links to their web sites. I looked on Cyndi's List and found that the links were not broken. Upon further investigation I found that they were referring to outdated links that they found on the Wayback Machine on archived versions of Cyndi's List. Of course, I ran into this same problem with cached versions of the site found on Google, and I periodically get reports from people about broken links they find in the RootsWeb mailing list archives for the CyndisList Mailing List. Each time I have to explain to these people that they are viewing archived versions of links, not the live versions found on my site. To resolve this problem I have had Cyndi's List removed from archiving on Wayback and Google. The very nature of my site is to be as current and up-to-date as I can make it, so there is no need for archived versions with outdated links.

So, here is my problem. Tonight I'm working on a broken link to a census transcript for a small village in England. The page is no longer online, but it is archived at the Wayback Machine. To date I have always tried to find a replacement URL for a broken link. If there are no replacement addresses available I will delete the link. Instead, should I be updating the link with a URL that points to an archived version on Wayback? On one hand, I want to provide the link to get people to the genealogical information they need in their research. That is my purpose in maintaining Cyndi's List. On the other hand, I realize that the owner of the original web site must have had a reason for removing the page from the Internet. Maybe they lost interest in genealogy. Or they died. Or they didn't feel like sharing anymore. Or maybe the data was no longer free for them to publish. Perhaps they didn't have permission to publish it in the first place. Or maybe they donated it to someone else to publish. I could go on and on with possible reasons. Is it up to me to point people to an archived copy of a web page that the original author may or may not want shared?

The genealogist in me wants to help others and point to the archives. The writer/publisher in me doesn't want to set a precedent by linking to the archives only to find out that it becomes a problem down the road when someone objects. Further, if there are any copyright issues involved, I could be held responsible for linking to the archives, because sending people in that direction facilitates the propagation of the copied work.

Of course, more problems may arise when it comes to fixing broken links in the future if the Wayback Machine web programmers make changes that alter the configuration of the URLs, causing all of the archive links to break. Sigh.