Saturday, October 27, 2012
Wait, wait, don't tell me
Two colleagues recently forwarded me a pair of links they thought I would find interesting. One was an MLA job listing for an assistant professor "in American or British literature, 16th-20th century, with interest in the problematic of digital humanities". It included the specification, "Some familiarity with MSWord expected." The other was for a new journal called Digital Philology. The author's guidelines include the invitation, "Digital Philology is welcoming submissions for its 2013 open issue. Inquiries and submissions (as a Word document attachment) should be sent to" ...
I actually had to read each of these twice to realize that one was intended as a parody, and the other is apparently intended seriously. In the spirit of the news quiz on NPR's "Wait, wait, don't tell me," you decide: which one is the parody?
- a job listing for an assistant professor asking for familiarity with MS Word
- a new journal, Digital Philology, soliciting submissions as emails with attached Word documents
If you can't tell, the links lead to the full listings on the original web pages, where you'll find further clues.
Welcome to the world of digital humanities and digital philology in 2012.
Monday, October 8, 2012
wikicite?
I recently heard a radio interview with Timothy Messer-Kruse, describing his experience editing the Wikipedia article on the Haymarket trial. (He had earlier described the same experience in the Chronicle of Higher Education, online here.)
The striking point is that his edits on Wikipedia were repeatedly reverted because they were based on and supported by primary evidence. Wikipedia is, by design, intended to reflect consensus opinion as found in secondary publications. That is a horrible inversion of the way history should be studied and presented, and one that Wikipedia shares with encyclopedias in general, including distinguished specialized encyclopedias like the Oxford Classical Dictionary.
While I don't like encyclopedias, I love the crowd-sourced part of Wikipedia. So what if we created a "wikicite" for classical studies? Imagine a wiki where only primary sources were allowed: no references to secondary publications of any kind permitted. You are of course welcome to read them on your own time, and maybe even learn something from them, but to post to wikicite, you would actually have to work back to the primary sources and confront evidence you could cite.
That would be revolutionary.
Sunday, August 26, 2012
Mark[up|down]
People sure hate the pointy brackets. I've been writing markup since version 2 of the Text Encoding Initiative Guidelines in SGML in the 1980s, and as easy as modern tools like oXygen make it today, even I'm not crazy about it. Over the past year or so, I've looked at every "markdown" alternative to markup that I could find: to name a few, textile, markdown, multimarkdown, reStructuredText, and more wiki languages than I can list.
All of them seem to have a similar history: somebody wanted a quicker and easier way to express HTML, cooked up a tool to convert some simple "markdown" conventions to HTML, and realized, "Hey, this is useful!" As a result, all of the markdown languages share two main drawbacks. The first is that they generally express only the semantics of HTML, or a subset of HTML. That rules them out for any writing or editing that requires richer semantics, such as an XML vocabulary could supply. The second, more fundamental, drawback is a long-recognized one: they're not specified. It should be obvious that "whatever my converter tool handles" is NOT a specification, but that's the state of most of the markdown schemes. (See these comments from more than a decade ago about reStructuredText!)
For those cases where I really just want a quick and easy way to bang out HTML-like content, John Gruber's markdown seems to offer the best compromise. First, if you really need some particular piece of HTML beyond what markdown offers, you can simply embed it in your text (although at the price of reintroducing some pointy brackets). But what really persuades me is the pegdown processor.
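To illustrate that escape hatch, a fragment like the following (invented for illustration) mixes ordinary markdown conventions with a raw HTML table, since Gruber's basic markdown has no table syntax of its own:

    Here is *emphasized* text and a [link](http://example.org).

    <table>
      <tr><td>alpha</td><td>beta</td></tr>
    </table>

    And here we are back in plain markdown paragraphs.

The processor passes the block-level HTML through untouched while converting the surrounding text.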
Pegdown is built on parboiled's support for "parsing expression grammars" (PEGs), so it comes closer to a separately specified definition of the language than a code library full of regular expressions emitting some kind of converted text. Pegdown will give you an abstract parse tree for your markdown content, which makes me feel much more confident using markdown from code I write.
Add to that the ever-growing number of editors and other tools that support markdown in all kinds of contexts, and I'm converted. So was the text of this post: from markdown to HTML.
Tuesday, July 31, 2012
In a small discipline, proxy repositories
Software builds on other software. With a build system like gradle, once you declare how your code depends on other code, the build system resolves your declarations against the repositories you list, and downloads the appropriate packages as they are needed. If you are coding in a JVM language, you can find an enormous proportion of the libraries you might want in maven central, either directly or via a proxy.
But if you routinely work with ancient Greek, or in any similarly specialized domain, the situation is different. Hugh Cayless' epidoc transcoder package is indispensable for my routine work, for example, but for a few minutes yesterday, the one repository where it is regularly hosted was down. I was paralyzed.
The solution is as easy as it is obvious: smaller communities, like those interested in ancient Greek, need to ensure that the collections of material they depend on are proxied and available from multiple repositories.
I'm using Nexus to host material developed for the Homer Multitext project, and yesterday configured it to proxy dev.papyri.info/maven, where the epidoc transcoder is housed. The unified front to all the material hosted and proxied there is http://beta.hpcc.uh.edu/nexus/content/groups/public/.
Nexus is a "lazy" proxy: it acquires a local copy of a proxied package only when the package is actually requested. One way to guarantee that your favorite proxying site has all the packages you want is a minimal build that declares dependencies on everything you might want and then simply lists their names. The example below is a gradle build that does just this. The repository URL and version strings for packages are kept in a separate properties file, but the example is otherwise complete: running the showAll task will force the proxy server to retrieve any packages it does not already have locally stored.
repositories {
    maven {
        url "${repositoryUrl}"
    }
}

configurations {
    classics
}

dependencies {
    classics group: 'edu.harvard.chs', name: 'cite', version: "${citeVersion}"
    classics group: 'edu.harvard.chs', name: 'greekutils', version: "${greekUtilsVersion}"
    classics group: 'edu.holycross.shot', name: 'hocuspocus', version: "${hocusPocusVersion}"
    classics group: 'edu.unc.epidoc', name: 'transcoder', version: "${transcoderVersion}"
}

task showAll {
    description = "Downloads and shows a list of useful code libraries for classicists."
    doLast {
        println "Downloaded artifacts (including transitive dependencies):"
        configurations.classics.files.each { file ->
            println file.name
        }
    }
}
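The separate properties file that the build reads might look something like this. The repository URL is the Nexus group mentioned above; the version numbers here are purely illustrative placeholders, not actual release numbers:

    repositoryUrl=http://beta.hpcc.uh.edu/nexus/content/groups/public/
    citeVersion=0.9.0
    greekUtilsVersion=0.7.0
    hocusPocusVersion=0.5.1
    transcoderVersion=1.1

Keeping coordinates like these in a gradle.properties file means you can point the build at a different proxy, or bump a version, without touching the build script itself.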
Monday, July 9, 2012
"Abolish the journals"
I'm appearing on a panel next spring on the subject of "publishing" at the meeting of the Classical Association of the Middle West and South. Would it be too much to suggest that Walter Olson's critique of law reviews applies equally well to academic journals in the humanities?
Olson quotes Harold Havighurst:
Whereas most periodicals are published primarily in order that they may be read, the law reviews are published primarily in order that they may be written.

Sounds pretty much like the academic journals I'm familiar with in classics.
(H/T: groklaw news picks for the link to Olson's blog.)
Thursday, July 5, 2012
CC licenses for photography of manuscripts

Sunday, June 3, 2012
Who owns Plato?
I attended the workshop "édition des textes et recherche interdisciplinaire" ("text editing and interdisciplinary research") at the École Normale Supérieure last week. As I mentioned in a preceding post, I had been thinking about Eben Moglen's talk "Innovation under Austerity," and since I suspected that introducing Moglen's argument might be a bit provocative for the traditional audience I expected at the ENS, I cleverly thought I would win them over, or at least delay their criticism, by paraphrasing one of Moglen's memorable soundbites: "No one owns Plato."
Not so clever. Apparently, when you gather in the august Salle des Actes at ENS, you can meet people who believe they do own Plato, and don't care to share with others who fall short of their standards, thank you very much.
Just for fun, I googled the phrase "plato download": as the screen grab illustrates, Google estimated something over 17 million hits for that phrase, including texts in Greek and in translation in a variety of languages, podcasts, and ebooks (as well as downloads of software packages named after the son of Ariston). I also found the Wikipedia article on Ruhollah Khomeini noting that Khomeini considered Plato's views "in the field of divinity" to be "grave and solid". (Since some of the would-be owners of Plato also object to Wikipedia, I can pass along its reference to Kashful-Asrar, p. 33, as the source of that assertion.)
So while I can appreciate highly theorized concerns about the preparation needed to appreciate Plato "properly", the Anglo-Saxon empiricist in me looks at these Google search results and still wonders — just who exactly owns Plato?