What’s in a Citation
© 3 November 2021 Luther Tychonievich
Licensed under Creative Commons: CC BY-NC-SA 3.0

An exploration of citation complexity; an extension of two posts ago.


The following citation appears in the family tree of one of my relatives:

"New York, New York City Marriage Records, 1829-1940," database, FamilySearch (https://familysearch.org/ark:/61903/1:1:24HG-T2H : 10 February 2018), Peter Tychoniewicz in entry for Gabriel Tychoniewicz and Ahaphia Oleszczuk, 05 Oct 1907; citing Marriage, Manhattan, New York, New York, United States, New York City Municipal Archives, New York; FHL microfilm 1,452,208.

There’s a lot going on here. Let’s try taking it apart.

This single example (literally the first one I grabbed; I didn’t look around for a special one) illustrates several points worth keeping in mind about family history citations.

First, the common presentation of citations is not in an obvious order or format. The general rule is something like most-important-data-first; thus we begin with a name that can indicate what kind of source this is, then how to find it online, then where to look in it once you find it. After that we put the sources of the source, original first then derivative. Some details are also phrased in English, like “in entry for”, and some details are elided completely, such as assuming you know that an entry for two people in this database means an entry for their marriage, or assuming you know which repository hosts any microfilm whose name starts with “FHL”.

Second, the citation itself is made of multiple parts and their relationships to one another. Each part is relatively simple: a type and 1–3 details. Most parts are related to just one other part, generally by being within that other part; but some have multiple relationships, such as the database being related both to the document and to the microfilm. The implicit data might suggest that the document is actually related to the database only through the microfilm, making a single path of relationships, but that same implicit data adds two other branches: the database derives from the microfilm and is hosted on the Internet, and the microfilm derives from the document and is hosted in the Family History Library.

We might be able to finagle this example into a simple non-branching structure by calling repositories properties instead of parts, but in general that won’t be possible. Branching provenance is the common case.
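As a rough illustration of that parts-and-relationships view, here is a minimal Python sketch of the marriage citation. The `Part` class, the relation names (`within`, `derived_from`, `hosted_in`), and the detail keys are all my own hypothetical choices, not any standard model, and attaching the original document to the Municipal Archives is my reading of the "citing" clause.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Part:
    """One part of a citation: a type plus a few details (hypothetical model)."""
    kind: str
    details: Dict[str, str]
    links: List[Tuple[str, "Part"]] = field(default_factory=list)

    def link(self, relation: str, other: "Part") -> None:
        # Record a named relationship from this part to another part.
        self.links.append((relation, other))


# Repositories, two of which are only implicit in the citation text.
internet = Part("repository", {"name": "The Internet"})
fhl = Part("repository", {"name": "Family History Library"})
archives = Part("repository", {"name": "New York City Municipal Archives"})

document = Part("document", {"type": "Marriage",
                             "place": "Manhattan, New York, New York, United States"})
microfilm = Part("microfilm", {"number": "1,452,208"})
database = Part("database", {
    "title": "New York, New York City Marriage Records, 1829-1940",
    "publisher": "FamilySearch",
    "url": "https://familysearch.org/ark:/61903/1:1:24HG-T2H",
})
entry = Part("entry", {
    "person": "Peter Tychoniewicz",
    "entry_for": "Gabriel Tychoniewicz and Ahaphia Oleszczuk",
    "date": "05 Oct 1907",
})

entry.link("within", database)
database.link("derived_from", microfilm)
database.link("hosted_in", internet)
microfilm.link("derived_from", document)
microfilm.link("hosted_in", fhl)
document.link("hosted_in", archives)  # assumption: my reading of "citing ..."


def branching(parts: List[Part]) -> List[Part]:
    """Return the parts that have more than one outgoing relationship."""
    return [p for p in parts if len(p.links) > 1]


parts = [entry, database, microfilm, document, archives, fhl, internet]
print([p.kind for p in branching(parts)])
```

Even in this small sketch both the database and the microfilm have two outgoing relationships, so the structure is a graph and not a simple chain, unless the repositories are folded into their parts as properties.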

For contrast, let’s look at an academic citation:

Luther Tychonievich and James P. Cohoon. 2020. “Lessons Learned from Providing Hundreds of Hours of Diversity Training.” In Proceedings of the 51st ACM Technical Symposium on Computer Science Education (SIGCSE ’20). Association for Computing Machinery, New York, NY, USA, 206–212. DOI:https://doi.org/10.1145/3328778.3366930

This citation also has multiple parts:

Several of the previous observations still apply here. Most important first: reputation and originality are the currency of academia, so author and date are the most important, followed by the venue as that is a proxy for exclusivity and thus merit, with data needed to actually locate the article last. The citation is also a branching set of interconnected parts, though in this case the parts are very predictable and can easily be flattened into a couple dozen possible key:value pairs.
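To make the flattening concrete, here is a small Python sketch that stores the academic citation above as key:value pairs and renders them in a BibTeX-like layout. The key names follow common BibTeX fields, but that choice, the `to_bibtex` helper, and the entry key `tychonievich2020lessons` are assumptions made for illustration only.

```python
# The citation above as flat key:value pairs (field names are an assumption,
# borrowed from common BibTeX usage).
citation = {
    "author": "Luther Tychonievich and James P. Cohoon",
    "year": "2020",
    "title": "Lessons Learned from Providing Hundreds of Hours of Diversity Training",
    "booktitle": "Proceedings of the 51st ACM Technical Symposium on Computer Science Education",
    "series": "SIGCSE '20",
    "publisher": "Association for Computing Machinery",
    "address": "New York, NY, USA",
    "pages": "206--212",
    "doi": "10.1145/3328778.3366930",
}


def to_bibtex(key: str, fields: dict, entry_type: str = "inproceedings") -> str:
    """Render flat fields as a BibTeX-style entry (hypothetical helper)."""
    lines = [f"@{entry_type}{{{key},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


print(to_bibtex("tychonievich2020lessons", citation))
```

Nothing here needs nesting: the venue, publisher, and location are just more fields on the one record, which is what makes academic citations comparatively easy to handle.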

Citations, academic and family historical, are a brief presentation of a fairly large amount of interlinked data. For academic citations, the total set of parts is generally quite limited and well handled by various digital formats. For family history citations, the chains get much longer with more branching and there are fewer limitations that can be exploited to simplify them.
