Underlying the CTS URN notation is the abstract model of textual structure abbreviated as OHCO2.
The generality of this model is nicely illustrated by recent implementations of the Canonical Text Services (CTS) protocol. The CTS protocol provides retrieval of texts by CTS URN: implementations linked from this page use XML tree structures, relational databases and directed graph stores to store and retrieve texts.
For an essential scholarly concept (identifying a citable passage of text), that's a powerful level of abstraction permitting scholars and developers to select technologies best suited to the specific kind of work they want to pursue with a citable corpus.
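To make the abstraction concrete, here is a minimal sketch of how a CTS URN exposes two hierarchies, one for the work and one for the passage, independent of any storage technology. The class and function names are hypothetical, the example URN is only illustrative, and the sketch deliberately ignores passage ranges and subreferences.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CtsUrn:
    """Hypothetical container for the colon-delimited parts of a CTS URN."""
    namespace: str
    work_parts: Tuple[str, ...]     # e.g. text group, work, version
    passage_parts: Tuple[str, ...]  # citation hierarchy, e.g. book, line


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split urn:cts:NAMESPACE:WORK[:PASSAGE] into its components.

    Ranges (1.1-1.10) and subreferences (@...) are not handled here.
    """
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work = parts[2], parts[3]
    passage = parts[4] if len(parts) == 5 else ""
    return CtsUrn(
        namespace=namespace,
        work_parts=tuple(work.split(".")),
        passage_parts=tuple(passage.split(".")) if passage else (),
    )


def passage_contains(container: CtsUrn, node: CtsUrn) -> bool:
    """True if node's passage sits at or below container's passage
    in the citation hierarchy of the same work."""
    if (container.namespace, container.work_parts) != (node.namespace, node.work_parts):
        return False
    depth = len(container.passage_parts)
    return node.passage_parts[:depth] == container.passage_parts


line = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.msA:1.1")
book = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.msA:1")
print(line.work_parts, line.passage_parts)
print(passage_contains(book, line))
```

Nothing in this sketch depends on whether the text lives in XML, a relational database, or a graph store; the URN alone carries the ordered hierarchy.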
Saturday, March 7, 2015
OHCO2 FTW
Saturday, October 25, 2014
Open license + an iPad mini
At the end of Open Access Week, I'd like to salute the library of Leiden University for living up to the goal of making open access the norm in scholarship. If you work in its very pleasant setting, there are no restrictions on how you make use of out-of-copyright material.
When I visited Leiden earlier this year, I had an iPad mini with me, so I took a few quick snaps of Codex Vossianus Graecus 1, a set of maps (perhaps sixteenth century) to accompany Ptolemy's Geography. Thanks to the library's policies, I can make images like this one available as citable scholarly resources.
Leiden University, Codex Voss.Gr. 1: world map in Ptolemy's first projection
When the phone or tablet you happen to be carrying gives you photos rivaling or surpassing anything published in print, the technology is not much of a barrier. When the default policy is that you can use your photographs as you see fit, neither is legal licensing.
(To see what's legible in a quick and poorly lit snap from an iPad mini, see this zoomable image of folios 2-3.)
Friday, July 4, 2014
Paleography matters in the Declaration of Independence: a CITE response
My colleague Tom Martin points me to this article in the New York Times, reporting that Danielle Allen at the Institute for Advanced Study in Princeton has questioned the National Archives’ transcription of a crucial phrase in the Declaration of Independence. Are Thomas Jefferson’s “self-evident truths” comprised of individual rights, or do they also include a governmental role “to secure these rights”? Your judgment could hang on whether or not you see a period followed by a long dash or simply a long dash in the original document.
I browsed the National Archives web site, and found that they offer two downloadable images, one a photograph of the original parchment, and another of the 1823 engraving by William Stone, both apparently in the public domain.
So I took a few minutes of my Fourth of July holiday to set up a CITE Image Service where you can browse and create citable references of the images. Here is the detail of the crucial passage in the photograph of the parchment:
In the Image Collection I created this afternoon, this detail can be cited generically with this URN:
urn:cite:mid:natarchimgs.Declaration_Pg1of1_AC@0.472,0.1872,0.082,0.0213
The URN can also be resolved to see the detail in context.
Contrast the Stone engraving:
(citable as
urn:cite:mid:natarchimgs.Declaration_Engrav_Pg1of1_AC@0.465,0.1919,0.076,0.0177
and viewable in context here)
With references like this, it would be easy to cite other examples in the document of periods and long dashes, much as participants at last week’s Homer Multitext seminar collated evidence to interpret features of the oldest extant manuscript of the Iliad.
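As a rough illustration of what such a reference encodes, the sketch below splits a CITE image URN into its object identifier and its region of interest, then converts the region to a pixel box. It assumes that the four values after `@` are left, top, width and height expressed as fractions of the full image; the function names and the pixel dimensions in the example are hypothetical, not the real size of the National Archives scan.

```python
from typing import NamedTuple, Optional, Tuple


class ImageUrn(NamedTuple):
    namespace: str
    collection: str
    object_id: str
    region: Optional[Tuple[float, float, float, float]]  # left, top, width, height


def parse_image_urn(urn: str) -> ImageUrn:
    """Parse urn:cite:NAMESPACE:COLLECTION.OBJECT[@x,y,w,h]."""
    base, has_region, roi = urn.partition("@")
    parts = base.split(":")
    if len(parts) != 4 or parts[:2] != ["urn", "cite"]:
        raise ValueError(f"not a CITE URN: {urn!r}")
    collection, _, object_id = parts[3].partition(".")
    region = None
    if has_region:
        values = tuple(float(v) for v in roi.split(","))
        if len(values) != 4 or any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError(f"bad region of interest: {roi!r}")
        region = values
    return ImageUrn(parts[2], collection, object_id, region)


def region_to_pixels(urn: ImageUrn, width_px: int, height_px: int) -> Tuple[int, int, int, int]:
    """Scale the fractional region to (left, top, width, height) in pixels.

    Assumes the region values are fractions of the full image dimensions.
    """
    if urn.region is None:
        return (0, 0, width_px, height_px)
    x, y, w, h = urn.region
    return (round(x * width_px), round(y * height_px),
            round(w * width_px), round(h * height_px))


detail = parse_image_urn(
    "urn:cite:mid:natarchimgs.Declaration_Pg1of1_AC@0.472,0.1872,0.082,0.0213"
)
# 10000 x 8000 is a made-up image size used only for illustration.
print(detail.object_id, region_to_pixels(detail, 10000, 8000))
```

Because the region is relative rather than in pixels, the same URN stays valid for any resolution of the same image.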
Conclusions? The parchment of the Declaration is hard to read, but paleography is important, and the CITE architecture that was originally created for the Homer Multitext project can be applied to any sort of paleographic problem.
Saturday, May 10, 2014
More reasons to love markdown plus critic markup
Deadlines for senior projects mean that in addition to the interesting challenge of how to submit genuinely replicable digital scholarship to the library's institutional repository, it's time to generate pdfs so that the Graphic Arts Department can bind something for the library shelves. The projects I'm advising are formatted in markdown extended to support citation by scholarly URN (what I'm calling "citedown"). We wanted to create markdown source that could be used with leanpub, beautiful docs, or pandoc, so the automated workflow has to handle some potentially complex issues resolving URNs, downloading local copies of embedded images and rewriting references to them, etc.
I had been using critic markup for editorial questions and copy editing, but with one eye on the calendar, I wanted to test the pdf workflow before we had a complete draft with all critic markup resolved.
To my surprise, when we used pandoc to lay out the text with a LaTeX book structure, it recognized the critic markup and formatted it in the resulting pdf! Comments default, appropriately, to a screaming magenta that could have been taken from a 1990s GIS palette. (Anyone who forgets to run their automated process to find and resolve critic markup will have a hard time missing these.)
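For the "find" half of that automated process, a minimal sketch might look like the following. It is not the workflow actually used for these projects; it simply scans text for the five standard CriticMarkup delimiters and reports where they occur, and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# The five CriticMarkup constructs and their delimiters.
CRITIC_PATTERNS = {
    "addition": re.compile(r"\{\+\+.*?\+\+\}", re.DOTALL),
    "deletion": re.compile(r"\{--.*?--\}", re.DOTALL),
    "substitution": re.compile(r"\{~~.*?~>.*?~~\}", re.DOTALL),
    "highlight": re.compile(r"\{==.*?==\}", re.DOTALL),
    "comment": re.compile(r"\{>>.*?<<\}", re.DOTALL),
}


def find_critic_markup(text: str) -> List[Tuple[int, str, str]]:
    """Return (line number, kind, matched text) for each unresolved mark."""
    hits = []
    for kind, pattern in CRITIC_PATTERNS.items():
        for match in pattern.finditer(text):
            # Line numbers are 1-based, counted at the start of the match.
            line = text.count("\n", 0, match.start()) + 1
            hits.append((line, kind, match.group(0)))
    return sorted(hits)


sample = (
    "First line.\n"
    "A {++new++} word and a {>>check this<<} note.\n"
    "Swap {~~old~>new~~} here.\n"
)
for line, kind, mark in find_critic_markup(sample):
    print(f"line {line}: {kind}: {mark}")
```

A build script could refuse to produce the final pdf whenever this list is non-empty.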
Pandoc has always been a major reason to love markdown's simplicity. Now it's one more reason to consider the combination of markdown plus critic markup.
Wednesday, April 30, 2014
Publishing digital scholarship
- archival material: TEI-conformant editions of texts, and a variety of data sets in simple delimited text formats
- analytical material: expository prose in markdown, using URNs to refer to all citable resources
- source material for user-interfaces: interactive presentations of analytical and archival material as servlets.
- be sure you have Vagrant and VirtualBox installed
- download our virtual machine repository, and run vagrant up in its root directory
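The two steps above can be sketched as a small script that checks for the prerequisites before starting the machine. This assumes the `vagrant` and `VBoxManage` executables are on your PATH when the tools are installed; the function names are hypothetical, and the repository directory is whatever location you downloaded it to.

```python
import shutil
import subprocess
from typing import List, Sequence

# Command-line executables installed by Vagrant and VirtualBox respectively.
REQUIRED_TOOLS = ("vagrant", "VBoxManage")


def missing_tools(tools: Sequence[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the names of required executables not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def start_vm(repo_dir: str) -> None:
    """Run 'vagrant up' in the root directory of the downloaded repository."""
    missing = missing_tools()
    if missing:
        raise RuntimeError("please install: " + ", ".join(missing))
    # check=True raises CalledProcessError if vagrant exits with an error.
    subprocess.run(["vagrant", "up"], cwd=repo_dir, check=True)
```

Calling `start_vm` with the path to your local copy of the repository either brings the machine up or tells you which tool is missing.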
Links
- Vagrant: http://www.vagrantup.com/
- VirtualBox: https://www.virtualbox.org/
- The CITE Architecture on GitHub: http://cite-architecture.github.io/ (including the citemgr build project)
Tuesday, April 1, 2014
Get funding for your DH project?
Try these regular expressions on the embedded youtube video:
s/your company/IT staff/g
s/our company/our project/g
s/expert/developer/g
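If you would rather try the substitutions on a transcript than in your head, here is a minimal Python sketch of the same three rules. Because "our company" also occurs inside "your company", the sketch anchors that pattern with a word boundary so the result does not depend on the order in which the rules run. The function name and the sample sentence are made up for illustration.

```python
import re

# (pattern, replacement) pairs mirroring the three substitutions.
RULES = [
    (r"\bour company", "our project"),  # \b keeps this from matching inside "your company"
    (r"your company", "IT staff"),
    (r"expert", "developer"),
]


def rewrite(text: str) -> str:
    """Apply every rule globally, like sed's s/.../.../g, in list order."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


print(rewrite("your company needs our company to hire expert help"))
# prints: IT staff needs our project to hire developer help
```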