For digital scholars, it's hard to talk or even think about scholarly reference systems without getting bogged down in technology-specific tar pits. Take the W3C's Resource Description Framework as an example. Its triple model is brilliantly simple and powerful: whether you think of it as "subject-verb-object" or "object-directed link-object", it is general, extensible, and lends itself readily to both abstract graph models and real machine implementations. Its syntax is expressed in terms of URIs, which could equally well be URLs (real addresses in the "http" scheme) or URNs (abstract names in the "urn" scheme). (See the W3C's document "URI Clarification" or the discussion "Untangle URIs, URLs, and URNs" to sort out the acronyms.) But when was the last time you saw a scholarly project using RDF to describe relations among objects identified by abstract name, rather than by address? In an RDF graph, "http" is a good URI scheme for an application retrieving material on the internet, but a poor choice for describing relations among persistent and immutable objects, where the "urn" scheme would be more appropriate.
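To make the contrast between naming and addressing concrete, here is a minimal sketch in plain Python rather than a real RDF library. The `Triple` type, the helper functions, and every identifier under `urn:example:` and `http://example.org/` are hypothetical, invented purely for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-verb-object statement; every term is a URI string."""
    subject: str
    verb: str
    obj: str


def uri_scheme(uri: str) -> str:
    """Return the scheme: the part of a URI before the first colon, lowercased."""
    scheme, sep, _rest = uri.partition(":")
    if not sep or not scheme:
        raise ValueError(f"not an absolute URI: {uri!r}")
    return scheme.lower()


def names_only(triple: Triple) -> bool:
    """True if every term of the triple is an abstract name (a 'urn' URI)."""
    return all(uri_scheme(term) == "urn" for term in triple)


# The same relation stated twice: once by address, once by abstract name.
# All identifiers below are made up for this example.
by_address = Triple(
    "http://example.org/texts/1",
    "http://example.org/rel/illustratedBy",
    "http://example.org/images/7",
)
by_name = Triple(
    "urn:example:text:1",
    "urn:example:rel:illustratedBy",
    "urn:example:image:7",
)

print(names_only(by_address), names_only(by_name))
```

Both triples have exactly the same graph structure. What differs is only whether the statement would survive the material moving to a new server.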
For canonically citable texts, work at the Center for Hellenic Studies has led to:
- identifying abstract properties of canonical texts
- developing a human-readable and machine-actionable notation for citation that expresses these properties (the CTS URN notation)
- defining a service for identifying and retrieving texts identified by this notation (the Canonical Text Services protocol)
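
As a rough illustration of what "machine actionable" can mean for the notation, the sketch below splits a CTS URN into its colon-delimited components. It assumes the general shape `urn:cts:<namespace>:<work>:<passage>`, with the work component subdivided by periods and the passage component optional. The `CtsUrn` type and `parse_cts_urn` function are hypothetical names of my own, not part of any CTS library, and the sketch does not attempt ranges, subreferences, or full validation.

```python
from typing import NamedTuple, Optional, Tuple


class CtsUrn(NamedTuple):
    namespace: str
    work: Tuple[str, ...]      # period-separated levels of the work hierarchy
    passage: Optional[str]     # None when the URN names a whole work


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into namespace, work hierarchy, and optional passage."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work = parts[2], parts[3]
    if not namespace or not work:
        raise ValueError(f"missing namespace or work component: {urn!r}")
    # A trailing colon with nothing after it means "no passage given".
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    return CtsUrn(namespace, tuple(work.split(".")), passage)


# A commonly cited URN for the first line of the Iliad.
print(parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001:1.1"))
```

Nothing in the parsed result says where the text lives. Resolving the name to actual content is the job of a separate service, which is the third item in the list above.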
I think this graded distinction of abstract property, reference notation, and application could be equally valuable in citing other kinds of material. In the series of posts that follows, I'll look successively at each level to analyze how we might cite uniquely identified objects.