Using Digital Media Files for Genealogy


Introduction

A lot of people ask how best to handle digital media in their genealogy research. This blog post explains some of the benefits of using digital media, and highlights some important aspects to bear in mind.

This blog post does not address how to best cite digital sources; that will be the topic of a future post.

What is different about digital media?

There are big differences between what you might call traditional paper sources (I will call these physical sources) and what most people predominantly use today: online digital sources.

First, physical sources exist: you can touch them. They don’t disappear overnight (unless there is a fire). Digital sources can disappear overnight, and because they can be duplicated with no effort they may appear in many places at the same time. They are essentially ephemeral.

Second, digital sources are either copies of physical sources that already exist, or they are databases that you use to find the original sources. Behind every digital source there is a physical source somewhere. If the digital source consists of images of the original physical sources then it can generally be considered to be a primary source.

Do not rely on an online digital source always being there. A business holding the digital source might go out of business, or a database may become inaccessible. For this reason alone, always, always, always take a copy of what you found (an image, or a copy of a transcription) and save it to your computer. But remember that the image you now have a copy of may be copyrighted, so take care not to make it available to other people in a way that violates the copyright. For example, if you found an image of a page of the England and Wales 1939 Register on Ancestry, you are OK to take a copy for your own use, or to attach the image to a person in your Ancestry tree, but you are _not_ OK to put the image on a public website (you may do so if you pay a fee to The National Archives and watermark the image).

 

This is an example of the sort of watermark that is required for a copyright image.

File names

This is where you face your first challenge: what should you call the digital media file on your computer?

Ideally before you start, sit back and think what you are trying to achieve when you name a digital file. Are you going to hunt through these files by their names at a later date? Almost certainly yes. As you are using the file for genealogy research, you are likely to need to search for it by the year it was created, by the name of a person mentioned in it, or by the type of document it is (census, birth certificate, etc).

A good policy is to use file names with the following structure:

<Family name> <Given names> <Document type> <year>

For example: Smith John Birth 1948.jpg, or Smith John Jones Emily Marriage 1847.jpg.

 

Example of file names for marriage records


The <Family name> and <Given names> in the file name are the names exactly as found within the document. I will explain the reason why further down this blog post.

The <Document type> you use really depends on exactly what research you are doing, but “birth”, “baptism”, “marriage”, “death”, “burial”, “probate” and “will” are a good start.

The <year> in the document name is usually clear - year of birth registration for an image of a birth certificate, year of probate for a page of the National Probate Calendar. Remember, it’s the year of the document. I personally solve the problem of photos of gravestones by using the first death/burial year shown on the gravestone.
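If you prefer to generate names rather than type them, the naming structure above can be sketched in a few lines of Python. This is only an illustration: the function name `media_file_name` and its parameters are my own assumptions, not part of any genealogy tool. It takes one or more (family name, given names) pairs exactly as they appear in the document, followed by the document type and year.

```python
def _tidy(text):
    """Collapse stray spaces so names stay consistent and searchable."""
    return " ".join(str(text).split())


def media_file_name(people, document_type, year, extension="jpg"):
    """Build '<Family name> <Given names> <Document type> <year>.<ext>'.

    people is a list of (family_name, given_names) pairs, spelled
    exactly as they appear in the document (hypothetical helper).
    """
    names = [f"{_tidy(family)} {_tidy(given)}" for family, given in people]
    stem = " ".join(names + [_tidy(document_type), str(year)])
    return f"{stem}.{extension.lstrip('.')}"


print(media_file_name([("Smith", "John")], "Birth", 1948))
# Smith John Birth 1948.jpg
print(media_file_name([("Smith", "John"), ("Jones", "Emily")], "Marriage", 1847))
# Smith John Jones Emily Marriage 1847.jpg
```

Passing the people as a list keeps the same function usable for single-person records and for marriages, where two names appear in the file name.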

Folders

Your second challenge will only become apparent at some point in the future, but it is a very good idea to plan for it now.

Over time you will build up a large number, thousands, of digital files relating to your research, and it will be difficult to cope with them all in one folder on your computer. You need to use a set of folders in a tree, so that each document goes in just one of those folders.

A simple set of folders would closely match the document types above, but does not have to do so.

 

A simple folder structure for genealogy files

You should choose whatever folders suit the way you work. And as with file naming, be consistent.

Some people use a folder for each surname. I prefer not to do this for the reasons given in the next section: people change their names.

Why have the document type in the folder name as well as the document name? Because you may take copies of some of the documents in the future and need to know what the document is, that’s why.

You also need to decide where on your computer to put these folders. I recommend creating a single starting folder within which _all_ your genealogy digital files will be saved. Create your folder tree within this single folder. This will make it easy to back up (see later), easy to find, and also prevent inadvertent duplication of files. If you have multiple clients and projects, as I do, then create an identical folder tree for each.
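As a rough sketch of the single starting folder idea, the following Python creates one folder per document type underneath a root folder. The folder names simply reuse the document types suggested earlier, and the function name `create_folder_tree` is an assumption for illustration; adjust both to suit the way you work.

```python
from pathlib import Path

# Folder names mirror the document types suggested earlier (an assumption;
# use whatever set of folders suits your own research).
DOCUMENT_FOLDERS = ["Birth", "Baptism", "Marriage", "Death", "Burial", "Probate", "Will"]


def create_folder_tree(root, folders=DOCUMENT_FOLDERS):
    """Create one sub-folder per document type under a single root folder.

    Safe to run repeatedly: folders that already exist are left untouched.
    """
    root = Path(root)
    created = []
    for name in folders:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Example call (the location is hypothetical):
# create_folder_tree(Path.home() / "Genealogy")
```

If you have several clients or projects, calling the same function once per project root gives each of them an identical tree.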

What name to use for the person and in document names?

Should you use the name as found in the document, or the name of the person it actually relates to?

This is a good question, and there’s no right answer. What is important is to always be consistent.

Good practice is to always name the file with exactly the same name as appears in the document. This is because the file name describes the document and what is found in the document.

So documents relating to a woman who marries twice and adopts her husbands’ names will have one of three names; those for a person who changes their whole family name will appear under one of two names.

Occasionally the document will list _both_ names. I have found the National Probate Calendar entry for a man born as William Montague Lawrence who died as William Montague Loft, and it lists both his names. In cases like this it would be best to name the file after the name that they were using at the time the original (physical) document was created.

 

Two names for the same person


Of course there may be different spellings of the name - I recommend always using the spelling found in the document itself, on the basis that this is the spelling needed to locate the document in its source. On one project I have a family whose name appears variously as Altaresky, Alteresky, and Altereski over a ten-year period.

What name to give the person in your genealogy software? It is preferable always to use the person’s birth name for this, because this label identifies the person; if they change their name they are still the same person. If there is any doubt about spelling, choose the first spelling that they themselves wrote (as opposed to being written by somebody else) if possible. In the example above, I chose to use Altaresky because that is how they signed their name in their marriage register entry. You should also record all the other spellings as alternative names.

Altaresky as signed by the groom is different from Alteresky as written by the clerk.


Your genealogy software

What role does your genealogy software play?

  1. it holds a database of people, events, and relationships
  2. it holds references to sources where information came from
  3. it holds a database of digital media files
  4. it displays family trees visually and helps you to navigate through them
  5. it provides tools for you to search through the databases to find specific people, and sometimes specific data

It’s this search engine function (5) that is easy to overlook. Your media file naming and folder structure go some way to achieving this outside your genealogy software, but they don’t enable you to search every possible way. Your genealogy software provides more advanced searching.

One advantage of this search is: if the file name contains the person’s name as shown in the document, and the person in the genealogy software is called by the birth name, then you have a way of finding the person and hence the document by searching for the actual name in the document (outside your genealogy software) or for the birth name (within your genealogy software).

1, 2, and 3 are provided by paper-based systems too. But 4 and 5 are the extra value the software brings to you.
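The search outside your genealogy software mentioned above is usually done with your computer's own file search, but a minimal Python sketch shows why consistent file names matter. The function `find_media` is hypothetical; it returns every file under a root folder whose name contains all of the words you give it, ignoring upper and lower case.

```python
from pathlib import Path


def find_media(root, *terms):
    """Return files under root whose names contain every search term.

    Matching is case-insensitive and on the file name only, so it relies
    on the names following the structure described earlier.
    """
    wanted = [term.lower() for term in terms]
    matches = []
    for path in Path(root).rglob("*"):
        name = path.name.lower()
        if path.is_file() and all(term in name for term in wanted):
            matches.append(path)
    return sorted(matches)


# Example call (the location is hypothetical):
# find_media(Path.home() / "Genealogy", "Smith", "Marriage", "1847")
```

Because this only looks at file names, it finds the name as spelled in the document; finding a person by birth name is still a job for the genealogy software.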

 

Advanced search using Family Tree Maker


Take backups

Because of the ephemeral nature of digital files, and their vulnerability to loss or damage, you need to adopt a policy to ensure you can recover from accidents.

  1. your master set of digital media files should be on your computer where you have full control of it.
  2. if you are sharing your family tree using an online tree service like Ancestry or Find My Past, then copies of some of your digital media files will also be held there, but these should always be copies of your masters.
  3. if you are synchronising between family tree software on your computer and an online family tree service (e.g. Family Tree Maker software and the Ancestry service) you need to be alert. Best practice is to do ALL of your work in the family tree software, and only use the online service to display the results of your work but never to make any changes. Why? Because when you synchronise changes that you made in the online service, any digital media files from the service will be given names and stored in locations that do NOT match your naming and folder structure above.
  4. take regular and frequent backups of your master media files to a location that is NOT on your computer. For example, copy them to OneDrive or Dropbox: these online (“cloud”) services are generally more secure and resilient than anything you could create at home. Or copy them to a flash drive and ask a relative to look after it for you. Or do both. Note that you should also separately take backups of your family tree.
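One possible way to take the regular backup described in point 4 is to zip the whole master folder into a dated archive. The sketch below uses Python's standard `shutil.make_archive`; the function name `backup_media` and the folder locations are assumptions for illustration. The backup folder must be outside the master folder, for example a flash drive or a folder that your cloud service synchronises.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_media(master_folder, backup_folder, today=None):
    """Zip the whole master folder into a dated archive in backup_folder.

    backup_folder must NOT be inside master_folder. Returns the path of
    the archive, e.g. 'Genealogy-2020-01-31.zip'.
    """
    master = Path(master_folder)
    destination = Path(backup_folder)
    destination.mkdir(parents=True, exist_ok=True)
    stamp = (today or date.today()).isoformat()
    base_name = destination / f"{master.name}-{stamp}"
    # make_archive adds the '.zip' extension and returns the full file name.
    archive = shutil.make_archive(str(base_name), "zip", root_dir=master)
    return Path(archive)


# Example call (both locations are hypothetical):
# backup_media(Path.home() / "Genealogy", "E:/Backups")
```

Putting the date in the archive name means each run produces a separate backup, so an accident today does not overwrite last week's good copy.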

File formats

Next, a little bit about file formats.

The trouble with digital media is that it is just numbers on a computer. You need to use a software application to turn those numbers into an image on your screen. Most of the time this seems to magically happen on its own just by double clicking on the file name - this is just your computer trying to make it easy for you by using the right software application automatically.

The problem with this is, at some point in the future, the right software application may no longer be available or may not run on your next, more modern, computer. When that happens you will not be able to view the document on your screen. This never happens with physical documents which at worst fade.

So you need to choose what file formats to use in order to lessen the chance of this happening. And there is no way to do this without being a bit technical. The key principle is to try to avoid proprietary file formats which are controlled by a commercial organisation. Instead try to always use formats that are open standards. It’s not always easy to determine this without getting your hands dirty, so here are some short suggestions.

For images:

  1. JPG, JPEG is an open standard image format and is the format you are most likely to encounter on the internet. This makes it a good candidate for your digital media files, with a warning: if the image file has been created badly it will be full of visual artifacts especially around letters and you will not be able to get rid of them later.
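  2. TIF, TIFF is an image format whose specification is owned by Adobe, although the specification is published and the format is very widely supported. It allows for images with no compression artifacts (unlike JPG).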
  3. PNG is an open standard image format that allows for images with no artifacts.
  4. BMP produces large files, and is not always recognised as an image file. If possible try to avoid this (your scanner may create them by default).
  5. Digital camera “raw” formats are very proprietary. If your camera can shoot “raw” try to set it to save both a “raw” image and a JPG image.

And for words:

  1. PDF was created by Adobe, but it has been published as an open standard (ISO 32000) since 2008 and can be read by many different applications.
  2. TXT is a completely open text file format (for notes and transcriptions), not an image file format, but does not allow for text formatting.
  3. Microsoft document formats are controlled by Microsoft, but if you use the versions with an ‘x’ in the file extension (Word DOCX, Excel XLSX), which are based on the published Office Open XML standard, you will at least be able to read them. Note that you can also save Microsoft documents in Open Document format, which is non-proprietary.
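To see which of the formats listed above you have actually accumulated, a short Python sketch can count the files under your starting folder by file extension. The function name `count_file_formats` is hypothetical, and the script only reports: it does not convert or change anything.

```python
from collections import Counter
from pathlib import Path


def count_file_formats(root):
    """Count files under root by lower-cased extension, e.g. '.jpg'.

    Files without an extension are counted under '(none)'.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


# Example call (the location is hypothetical):
# for extension, number in count_file_formats(Path.home() / "Genealogy").most_common():
#     print(extension, number)
```

A large count for an extension such as .bmp is a hint that your scanner's default settings are worth changing.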

Metadata

Finally a few words about metadata.

Metadata is information stored within a digital file that describes what the file contains. If it is available (not all file formats allow for this) there is great value in using this metadata in your genealogical research, to record things like what document the image is of, who appears in the image, and also copyright information. Then any time you send a copy to somebody else, or they change the name of the file, there is a record inside the file itself of what the file is.

If you use the JPG or TIF file format then you can easily add metadata without the need for special software. On a Windows PC simply right click on the file, select Properties at the bottom of the popup menu that appears, and click on the Details tab. Then add whatever you want in the “Description” and “Origin” sections and hit OK.

 


Metadata for a JPG file on Windows

 

Conclusion

The availability of digital media files is a gift to the genealogist, allowing you to acquire large amounts of reliable source material from around the world and then search through it rapidly and effortlessly.

 

The Tree Sleuths, 2020. The Tree Sleuths website.

 
