Digital Tools Summit - Interpretation
From ToolCenter
Tools for interpretation:
This is not an existing tools project but rather a proposal for a tools project arising from the Digital Tools Summit at the University of Virginia. For more on the Summit see http://www.iath.virginia.edu/dtsummit/ or notes at http://tada.mcmaster.ca/Main/ToolSummitNotes.
Interpretation develops out of an encounter with material or experience, and out of a reaction to some provocation, in the form of ambiguity, contradiction, suggestion, aporia, uncertainty, etc. In literary interpretation, you start with reading, and when you stumble on an ambiguity, you decide if this is an interesting ambiguity, possibly a meaningful one, possibly an intended one. Next, you ask what opportunities for interpretation this ambiguity offers. In the next phase, interpretation moves from private to public, from informal to formal, as you rehearse and perform it, intending to persuade other readers to share your interest and your conclusions.
Commentary is one way to convey interpretation, and it can be embodied as annotation. Annotation always needs at least one point of attachment, and it might need to be attached to several points in the corpus of material under study. You could have classes of commentary as well: a note to myself, a note to share, a note that has been peer-reviewed, a note that other people have noticed, a note that has been the subject of commentary, etc. Such annotations should attach to any type of media, and should allow production of commentary in many media as well.
The discussion group on tools for interpretation identified the following abstract sub-components of annotation, as an interpretation-building process, grouped here by phases:
Phase 0
0.1 Identify the environment (discipline, media)
0.2 Encounter a resource (search, retrieval)
Phase 1
1.1 Explore a resource
1.2 Vary the scope/context of attention
Phase 2
2.1 Tokenize, segment the resource (automatically or manually)
2.2 Name parts, rename parts
2.3 Align annotation with parts (including time-based material)
2.4 Vary or match the notation of the original content
Phase 3
3.1 Sort and rearrange the resource (perhaps in something as formal as a semantic concordance, perhaps just in some unspecified relationship)
3.2 Identify and analyze patterns that arise out of relationships
3.3 Code relationships, perhaps in a way that encourages the emergence of an ontology of relationships (Allow formalizations to emerge, or to be brought to bear from the outset, or to be absent)
We considered that phases 0 and 1 were probably outside the scope of our immediate charge (though we hoped that other groups, like the group focusing on exploration, might help with some of these phases), and we thought that phases 2 and 3 were probably pretty squarely within the territory of tools for interpretation.
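To make phases 2 and 3 a little more concrete, here is a minimal sketch in Python. It is only an illustration, not something the group specified: it assumes a plain-text resource, treats "parts" as word tokens addressed by character offsets, and uses hypothetical function names (`segment`, `concordance`, `repeated`). A real tool would need to handle images and time-based media as well.

```python
import re
from collections import defaultdict

def segment(text):
    """Phase 2.1: split text into word tokens, keeping character offsets."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\w+", text)]

def concordance(tokens):
    """Phase 3.1: rearrange the tokens, grouping offsets by lowercased form."""
    index = defaultdict(list)
    for start, end, word in tokens:
        index[word.lower()].append((start, end))
    return dict(index)

def repeated(index):
    """Phase 3.2: a very crude 'pattern': forms that occur more than once."""
    return sorted(word for word, spans in index.items() if len(spans) > 1)

text = "The rose is a rose, and the thorn is a thorn."
tokens = segment(text)
index = concordance(tokens)

# Phase 2.3: align an annotation with one part, addressed by its offsets.
annotations = {index["rose"][0]: "first mention; literal or figurative?"}

print(repeated(index))
```

Addressing parts by offsets rather than by copying their text is one way to let several annotations attach to the same or overlapping parts without altering the resource itself.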
Further, we thought that tools for interpretation should ultimately allow you to do these things (including phases 0-3) in arbitrary order, and on or off the web (in the field, in other words). Though actually publishing annotations/interpretations/commentary is probably out of scope for a tool for interpretation, narrowly defined, we agreed that there's no question that one would want to disseminate interpretation at some point in the process, and that those annotations should ideally be connected to networked resources and to other interpretations.
We spent some time discussing the audience for the kind of tools we were imagining: Developers? Power users? All humanists? High school students? With respect to users, we agreed that it was best to develop for an actual use, not a hypothetical one, but that it was also salutary to build for more than one use, if possible. This brought up the question of whether we envisioned tools for more than one (concurrent) user: in other words, are we talking about seminar-ware? How collaborative should these tools be, and how collaborative must they be? Should they have an offline mode (for some, the answer to this question was clearly yes)? Should they allow, support, or require serial collaboration? In the end, we decided that the best compromise was a single-user tool designed in awareness of a collaborative architecture (and we hoped to get some more information about what such an architecture might look like, from the collaborative group).
We also discussed some more specific technical matters, for example:
- Should the tools be general and broad or deep and specific? Probably the latter, but... Could we imagine an architecture that supports both? Perhaps.
- Could we specify some minimal general modalities for recognizing a unit/token? Yes: for example, a) unit is a file, b) unit is defined with a separator, c) unit is drawn by hand.
- What should be the data input options for this tool, or toolkit? Certainly, at a minimum, text, image, and time-dependent media. (That might leave out GIS, though; is the relevant distinction between proprietary and non-proprietary formats?)
- Should these tools have a common data-modeling language? For example, UML, Topic Maps, something else? We decided that this would be necessary, but it could be kept under the hood, as it were—optionally available for direct access by the end-user.
- Could this be a browser plug-in (a Firefox extension)? Might this be (in its seminar-ware version) an extension to Sakai?
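The three minimal modalities for recognizing a unit mentioned above (a file, a separator, a hand-drawn selection) could be sketched as follows. This is an illustration under stated assumptions, not a design the group agreed on: the resource is assumed to be a text string, a unit is represented as a `(start, end)` offset pair, and the function names are hypothetical.

```python
def unit_from_file(content):
    """Modality (a): the whole file is a single unit."""
    return [(0, len(content))]

def units_by_separator(content, sep):
    """Modality (b): units are the non-empty stretches between separators."""
    spans, pos = [], 0
    for piece in content.split(sep):
        if piece:
            spans.append((pos, pos + len(piece)))
        pos += len(piece) + len(sep)
    return spans

def units_by_hand(content, ranges):
    """Modality (c): units are ranges the user has drawn; only validate them."""
    spans = []
    for start, end in ranges:
        if not (0 <= start < end <= len(content)):
            raise ValueError(f"range {(start, end)} falls outside the resource")
        spans.append((start, end))
    return spans

content = "stanza one\n\nstanza two\n\nstanza three"
by_file = unit_from_file(content)
by_sep = units_by_separator(content, "\n\n")
by_hand = units_by_hand(content, [(7, 10), (19, 22)])

print([content[s:e] for s, e in by_sep])
```

Because all three functions return the same kind of value, later steps (naming, annotating, clustering) would not need to know how a unit was recognized.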
At this point, in an effort to bring our discussion to bear on a particular tool, and to cut short an abstract discussion of tools (in general) for interpretation, we focused on a very specific kind of tool for annotation, namely a "highlighter's tool." We supposed that this tool would:
- Work with texts, images, time-dependent media, annotated resources, or no primary source at all.
- Work on-line or off-line.
- Allow you to demarcate what's of interest (visually or by hand) according to a user-specified rule or an existing markup language specified in a standard grammar.
- Allow you to classify the highlighting according to your own evolving organizational structure, or according to an existing ontology/taxonomy specified in some standard grammar.
- Allow you to cluster, merge, link, overlap the demarcated parts.
- Allow you to attach annotation to the demarcated parts.
- Allow you to attach annotations to clusters and/or links.
- Allow you to search at least your own annotations.
- Produce output in some standard grammar.
- Accept its own output as input.
- Allow you to do these things in arbitrary order.
- Allow you to zoom to or select arbitrary levels of detail.
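A few of the requirements above (demarcate, classify, cluster, annotate parts or clusters, search your own annotations, produce output and accept it again as input) can be sketched with a small data model. Everything here is an assumption made for illustration: the class and method names are hypothetical, demarcated parts are character ranges in a text, and JSON merely stands in for whatever "standard grammar" a real tool would adopt.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Highlight:
    id: str
    start: int
    end: int
    label: str = ""  # the user's own, possibly evolving, category

@dataclass
class Note:
    targets: list  # ids of highlights or clusters
    text: str

@dataclass
class Workspace:
    highlights: dict = field(default_factory=dict)  # id -> Highlight
    clusters: dict = field(default_factory=dict)    # cluster id -> highlight ids
    notes: list = field(default_factory=list)

    def highlight(self, hid, start, end, label=""):
        self.highlights[hid] = Highlight(hid, start, end, label)

    def cluster(self, cid, member_ids):
        self.clusters[cid] = list(member_ids)

    def annotate(self, targets, text):
        self.notes.append(Note(list(targets), text))

    def search(self, term):
        """Case-insensitive search over the user's own notes."""
        return [n for n in self.notes if term.lower() in n.text.lower()]

    def dumps(self):
        """Produce output (JSON is a stand-in for a standard grammar)."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def loads(cls, data):
        """Accept the tool's own output as input."""
        raw = json.loads(data)
        ws = cls()
        ws.highlights = {k: Highlight(**h) for k, h in raw["highlights"].items()}
        ws.clusters = raw["clusters"]
        ws.notes = [Note(**n) for n in raw["notes"]]
        return ws

ws = Workspace()
ws.highlight("h1", 10, 24, label="variant")
ws.highlight("h2", 40, 52, label="variant")
ws.cluster("c1", ["h1", "h2"])
ws.annotate(["h1"], "Pedal marking differs between editions.")
ws.annotate(["c1"], "Both passages were revised together.")
restored = Workspace.loads(ws.dumps())
```

Notes point at highlights and clusters by id, so operations can happen in any order: a part can be annotated before it is classified, and clustered afterwards.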
Well pleased with ourselves for being so close to actual specs for an actual tool, we decided to go a step further and name some specific examples of uses and users. The following list suggests the range of topics, sources, and goals that we hope such a tool (or toolkit) might support:
- Chopin Variorum Project: http://www.ocve.org.uk/ This project represents a number of different editions of a work of music. The user wants to be able to study facsimiles of the documents, talk about the way they were printed, their differences, their coordinated parts, and comment on those parts and their relationships.
- A scholar currently writing a book on Anglo-American relations, who is studying propaganda films produced by US and UK governments and needs to compare these with text documents from on-line archives, coordinate different film clips, etc.
- The Dobbs project: http://ils.unc.edu/annotation/publication/ruvane_aag_2005.htm A geographer needs to reconstruct a history of 18th-century land settlement in the North Carolina Piedmont region, coordinating 6,000 historical documents with GIS data.
- The Society of Professional Indexers or others creating indexes based on textual or other data.
- Open Journal Systems might use this as an add-on tool for readers (or reviewers) of journal articles.
- A UBC project looking at medical literature in electronic form, such as how autistic children interact with psychiatrists or how psychiatrists communicate with one another, could use this tool to comment on those interactions and communications in multiple media.
- Anthropologists studying Migmaq language and culture, an ongoing scholarly enterprise that involves coordination of thousands of pages of texts, could use it for documents that are hard to parse for orthographic reasons.
- Variations3: http://newsinfo.iu.edu/news/page/normal/2453.html This is a large database of online music, needing annotation tools. Users will be able to play back music, view and annotate the score, and will have a tool for drawing score pages and a thematic annotation tool for audio resources. Work on such tools is already underway, in the context of this project and its datatypes.
- The Salar-Monguor Project: an endangered language documentation project that deals with language variation and language contact. It involves multilingual resources and multimedia, with collaborative annotation by scholars at all skill levels.
At the end of the discussion, a straw poll showed that half of the eighteen people in the room wanted to build this kind of tool, and all of them wanted to use it. We closed the discussion by affirming, once again, that we should build for particular applications and users but also in view of an agreed-upon set of requirements. The building process should include communication, if not collaboration, with other developers. We hope that follow-up from this event will result in people in this discussion realizing a framework for collaboration, and building tools for interpretation such as the ones imagined in this discussion.