I was very pleased to join Dan Cohen, Tom Scheinfeldt, Mills Kelly, and (for the first time!) Amanda French for Episode 69 of Digital Campus: “Strange Bedfellows.”
I am very flattered by the number of congratulations I’ve received on- and offline for the publication of my book, Reading Machines: Toward an Algorithmic Criticism (University of Illinois Press). Many have even pre-ordered it, and I fully intend to reimburse anyone who does that by buying them beer. But I must point out that the book is not actually out. When I try to click through and order a copy, it tells me that it won’t be available until the first week of December.
I don’t know how accurate that is (it’s the first time I’ve read an actual release date). I can say that the book is very close to being ready for the printer.
But now that it has been revealed, I want to take this opportunity to point out how amazing the cover is. I didn’t have any brilliant suggestions to make about cover art (I’m kind of a text guy), and so I just left it to the design department at Illinois. They hired a brilliant designer named Alex de Armond (http://www.alexdearmond.com/) who created the image by layering blank pages from Google Books, thus creating a texture somewhat like the layers of an onion skin. The thumb, of course, is one of the infamous artifacts from the Google Books scanning process. I am so very, very glad that I left this decision to others. I think it’s fantastic.
And really, why not judge a book by its cover?
This past term I gave a graduate course, in collaboration with Jeff Trzeciak, on “Technologies of Communication” and tried something that feels a bit subversive: I didn’t assign any essays. Of course, I’m almost certainly not the first humanities professor to not assign essays in a graduate course (I haven’t bothered looking for other examples, though one could check the syllabus finder), but it still goes against the grain of all my own experiences as a student and challenges what I take to be one of the most common practices of assessment in the humanities.
My reasoning was that students would probably only gain a slight incremental benefit from writing Yet Another Essay (even if we all benefit from every instance of writing and receiving feedback, no matter where we are in our careers). However, if I could formulate some assessment modules that encouraged students to express themselves with unconventional technologies – and, crucially, think about the process of formulating their ideas and arguments with the constraints and affordances of those technologies – then that experience would be much more valuable to them. Besides, I tend to like experimenting with pedagogy and don’t need much of an excuse to do so.
There are the four seasons – Winter, Spring, Summer, Fall – but there are also other annual phases that my academic physiology anticipates and experiences. I’m not talking about Oobleck, but rather things like the planning of the intense conference season in digital humanities (May-June), the start of the academic year (September), the preparation of my annual report (January), and the results from the Standard Research Grants programme at the beginning of April. Happily, this season brought two bits of good news about successful proposals:
A few weeks ago, I realized that I no longer use graphical applications.
That’s right. I don’t do anything with GUI apps anymore, except surf the Web. And what’s interesting about that, is that I rarely use cloudy, AJAXy replacements for desktop applications. Just about everything I do, I do exclusively on the command line. And I do what everyone else does: manage email, write things, listen to music, manage my todo list, keep track of my schedule, and chat with people. I also do a few things that most people don’t do: including write software, analyze data, and keep track of students and their grades. But whatever the case, I do all of it on the lowly command line. I literally go for months without opening a single graphical desktop application. In fact, I don’t — strictly speaking — have a desktop on my computer.
I think this is a wonderful way to work. I won’t say that everything can be done on the command line, but most things can, and in general, I find the CLI to be faster, easier to understand, easier to integrate, more scalable, more portable, more sustainable, more consistent, and many, many times more flexible than even the most well-thought-out graphical apps.
I realize that’s a bold series of claims. I also realize that such matters are always open to the charge that it’s “just me” and the way I work, think, and view the world. That might be true, but I’ve seldom heard a usability expert end a discourse on human factors by acknowledging that graphical systems are only really the “best” solution for a certain group of people or a particular set of tasks. Most take the graphical desktop as ground truth — it’s just the way we do things.
I also don’t do this out of some perverse hipster desire for retro-computing. I have work to do. If my system didn’t work, I’d abandon it tomorrow. In a way, the CLI reminds me of a bike courier’s bicycle. Some might think there’s something “hardcore” and cool about a bike that has one gear, no logos, and looks like it flew with the Luftwaffe, but the bike is not that way for style. It’s that way because the bells and whistles (i.e. “features”) that make a bike attractive in the store get in the way when you have to use it for work. I find it interesting that after bike couriers started paring down their rides years ago, we soon after witnessed a revival of the fixed-gear, fat-tire, coaster-brake bike for adults. It’s tempting to say that that was a good thing because “people didn’t need” bikes inspired by lightweight racing bikes for what they wanted to do. But I think you could go further and say that lightweight racing bikes were getting in the way. Ironically, they were slowing people down.
I’ve spent plenty of time with graphical systems. I’m just barely old enough to remember computers without graphical desktops, and like most people, I spent years taking it for granted that for a computer to be usable, it had to have windows, and icons, and wallpapers, and toolbars, and dancing paper clips, and whatever else. Over the course of the last ten years, all of that has fallen away. Now, when I try to go back, I feel as if I’m swimming through a syrupy sea of eye candy in which all the fish speak in incommensurable metaphors.
I should say right away that I am talking about Linux/Unix. I don’t know that I could have made the change successfully on a different platform. It’s undoubtedly the case that what makes the CLI work is very much about the way Unix works. So perhaps this is a plea not for the CLI so much as for the CLI as it has been imagined by Unix and its descendants. So be it.
I’d like this to be the first of a series of short essays about my system. Essentially, I’d like to run through the things I (and most people) do, and show what it’s like to run your life on the command line.
First up . . .
I think most email programs really suck. And that’s a problem, because most people spend insane amounts of time in their email programs. Why, for starters, do they:
1. Take so long to load?
Unless you keep the app open all the time (I’m assuming you do that because you have the focus of a guided missile), this is a program that you open and close several times a day. So why, oh why, does it take so much time to load?
What? It’s only a few seconds? Brothers and sisters, this is a computer. It should open instantaneously. You should be able to flit in and out of it with no delay at all. Boom, it’s here. Boom, it’s gone. Not, “Switch to the workplace that has the Web browser running, open a new tab, go to gmail, and watch a company with more programming power than any other organization on planet earth give you a . . . progress bar.” And we won’t even discuss Apple Mail, Outlook, or (people . . .) Lotus Notes.
2. Integrate so poorly with the rest of the system?
We want to organize our email messages, and most apps do a passable job of that with folders and whatnot. But they suck when it comes to organizing the content of email messages within the larger organizational scheme of your system.
Some email messages contain things that other people want you to do. Some email messages have pictures that someone sent you from their vacation. Some email messages contain relevant information for performing some task. Some email messages have documents that need to be placed in particular project folders. Some messages are read-it-later.
Nearly every email app tries to help you with this, but they do so in an extremely inconsistent and inflexible manner. Gmail gives you “Tasks,” but it’s a threadbare parody of the kind of todo lists most people actually need. Apple Mail tries to integrate things with their Calendar app, but now you’re tied to that calendar. So people sign up for Evernote, or Remember the Milk, or they buy OmniFocus (maybe all three). Or they go add a bump to the forum for company X in the hope that they’ll write whatever glue is necessary to connect x email program with y task list manager.
I think that you should be able to use any app with any other app in the context of any organizational system. Go to any LifeHacker-style board and you’ll see the same conversation over and over: “I tried OmniOrgMe, but it just seemed too complicated. I love EternalTask, but it isn’t integrated with FragMail . . .” The idea that the “cloud” solves this is probably one of the bigger cons in modern computing.
Problem 1 is immediately solved when you switch to a console-based email program. Pick any one of them. Type pine or mutt (for example), and your mail is before your eyes in the time it takes a graphical user to move their mouse to the envelope icon. Type q, and it’s gone.
Such programs tend to integrate well with the general command-line ecosystem, but I will admit that I didn’t have problem 2 completely cracked until I switched to an email program that is now over twenty years old: nmh.
I’ve written elsewhere about nmh, so allow me to excerpt a (slightly modified) version of that:
The “n” in nmh stands for “new,” but there’s really nothing new about the program at all. In fact, it was originally developed at the RAND Corporation decades ago.
We’re talking old school. Type “inc” and it sends a numbered list of email subject lines to the screen, and returns you to the prompt. Type “show” and it will display the first message (in any editor you like). You could then refile the message (with “refile”) to another mailbox, or archive it, or forward it, and so on. There are thirty-nine separate commands in the nmh toolset, with names like “scan,” “show,” “mark,” “sort,” and “repl.” On a day-to-day basis, you use maybe three or four.
I’ve been using it for over a year. It is — hands down — the best email program I have ever used.
Why? Because the dead simple things you need to do with mail are dead simple. Because there is no mail client in the world that is as fast. Because it never takes over your life (every time you do something, you’re immediately back at the command prompt ready to do something else). Because everything — from the mailboxes to the mail itself — is just an ordinary plain text file ready to be munged. But most of all, because you can combine the nmh commands with ordinary UNIX commands to create things that would be difficult if not impossible to do with the GUI clients.
I now have a dozen little scripts that do nifty things with mail. I have scripts that archive old mail based on highly idiosyncratic aspects of my email usage. I have scripts that perform dynamic search queries based on analysis of past subject lines. I have scripts that mail todo list items and logs based on cron strings. I have scripts that save attachments to various places based on what’s in my build files. None of these things are “features” of nmh. They’re just little scripts that I hacked together with grep, sed, awk, and the shell. And every time I write one, I feel like a genius. The whole system just delights me. I want everything in my life to work like this program.
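To give a flavor of what such a script can look like, here is a minimal sketch. It is not one of the scripts described above. It assumes an MH-style folder in which every message is its own plain text file, and the function name top_subjects is made up for illustration. It tallies subject lines, ignoring any leading "Re:" prefixes:

```shell
#!/bin/sh
# Sketch only: tally subject lines in an MH-style mail folder, where
# each message is one plain text file. top_subjects is a hypothetical
# name, and the folder is assumed to contain message files only.

# top_subjects DIR: print each distinct subject with its count,
# most frequent first.
top_subjects() {
    # Look only at the header block of each file (everything before
    # the first blank line) so body text is never counted.
    awk 'FNR == 1 { in_header = 1 }
         /^$/     { in_header = 0 }
         in_header && /^Subject:/ { sub(/^Subject: */, ""); print }' "$1"/* |
        sed 's/^\([Rr][Ee]: *\)*//' |   # fold replies into their thread
        sort | uniq -c | sort -rn
}
```

Run against a folder such as ~/Mail/inbox (the path is an assumption; yours depends on your nmh profile), `top_subjects ~/Mail/inbox | head` would list the busiest threads first.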
Okay, I know what you’re thinking: “Scripting. Isn’t that, like, programming? I don’t want/know how to do that.” This objection is going to keep re-appearing, so let me say something about it right away.
The programming we’re talking about for this kind of thing is very simple — so simple, that the skills necessary to carry it off could easily be part of the ordinary skillset of anyone who uses a computer on a regular basis. An entire industry has risen up around the notion that no user should ever do anything that looks remotely like giving coded instructions to a machine. I think that’s another big con, and some day, I’ll prove it to you by writing a tutorial that will turn you into a fearsome shell hacker. You’ll be stunned at how easy it is.
For now, I just want to make the point that once you move to the command line, everything is trivially connected to everything else, and so you are mostly freed from being locked in to any particular type of tool. You can use a todo list program that makes OmniFocus look like Notepad. You can use one that makes Gmail Tasks look like the U.N. Charter. Once we’re in text land, the output of any program can in principle become the input to any other, and that changes everything.
In the next installment, I’ll demonstrate.
(This is a lightly edited version of a post from my DayOfDH (next year there are rumblings that we may be able to aggregate content from our existing blogs instead of self-plagiarizing).)
We now have three years of DayOfDH blogging archives – that’s a pretty rich record of how digital humanists describe their activities in a given day. It also constitutes an interesting corpus for practising what Geoffrey Rockwell and I have been calling rapid analysis – trying to see what one might usefully glean from a relatively quick look at digital texts using specialized tools. Our interest in this is to develop techniques that might be useful to a wide range of people in digital society – for instance, students doing preliminary research or journalists compiling materials for an article.
Building the corpus was relatively easy, made even easier by a few tweaks kindly done by the DayOfDH team. I downloaded a full RSS archive of each year like this (I did this on the command line, but you can just open each quoted URL in the browser and save it if that seems more convenient):
$ curl "http://ra.tapor.ualberta.ca/~dayofdh/?wpmu-feed=posts" > 2009.xml
$ curl "http://ra.tapor.ualberta.ca/~dayofdh2010/?wpmu-feed=posts" > 2010.xml
$ curl "http://ra.tapor.ualberta.ca/~dayofdh2011/?wpmu-feed=fullfeed" > 2011.xml
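As a first quick look at files like these, something along the following lines can list the most frequent words in one year's feed. This is only a sketch under stated assumptions: top_words is a hypothetical helper, not one of the specialized tools mentioned above, and stripping tags with sed is crude (escaped markup and entities inside the feed will leak through).

```shell
#!/bin/sh
# Sketch only: a rough word-frequency list for one downloaded feed.
# top_words is a hypothetical helper name.

# top_words FILE [N]: print the N most frequent words (default 20)
# that are longer than three letters.
top_words() {
    sed 's/<[^>]*>/ /g' "$1" |          # drop tags that open and close on one line
        tr -cs '[:alpha:]' '\n' |       # one word per line
        tr '[:upper:]' '[:lower:]' |    # fold case
        awk 'length($0) > 3' |          # skip short words and empty lines
        sort | uniq -c | sort -rn |
        head -n "${2:-20}"
}
```

For example, `top_words 2009.xml 30` would print the thirty most frequent words of four letters or more in the 2009 archive.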
Today NEH hosts the 2010 Start-Up Grant project directors meeting, featuring lightning talks on 46 projects funded in the last round. It’s no secret that I’m a fan of lightning talks, which are short elevator pitches — they’ve been a part of THATCamp and we’ll be hosting a panel at the American Studies Association conference this fall that will include them (more details, soon). As Melissa Terras said in her DH2010 Plenary, we must “be prepared by having at the tip of our tongues what we do and why we matter and why we should be supported and why DH makes sense.” This is good practice.
Since each project will be limited to only two minutes and three slides, I plan to gloss over what was one of the more considerable parts of my grant proposal: a justification for APIs and funding of a workshop as a level-one startup. Instead, my pitch will address the basics: who, what, when, where, and why. I’ve also decided to share relevant text from my accepted proposal below for those who may be interested in learning more. Feel free to leave a comment or contact me directly.
Level 1 Start Up funding is requested to support a two-day workshop on Application Programming Interfaces (APIs), hosted by the Maryland Institute for Technology in the Humanities (MITH) at the University of Maryland. The workshop will gather 40-50 Digital Humanities scholars and developers who are using or interested in using APIs in their digital projects, industry leaders who will demonstrate their APIs, and practitioners who will help guide the group through the “working weekend.” The workshop will lay the groundwork for the integration of APIs into participant projects, and serve as a platform to develop future ideas for how to share and access humanities data through APIs. Presentations by workshop presenters will be video recorded, and recordings made available online through the workshop and MITH websites. Similarly, workshop activities will be archived on the workshop website, which will act as a clearinghouse and publication of all workshop-related content.
Enhancing the Humanities
Thanks largely to generous funding at both the national and global levels, there are now many large repositories of cultural and scholarly data freely available on the Web. The best of these repositories usually provide tools for searching, viewing, and manipulating their contents; tools which, following the current conventions and values of web design, are often designed according to an uncluttered, simple aesthetic. These tools make the most common use cases as intuitive and simple as possible, and return nicely formatted results presented as a web page (rather than, say, an XML file or Excel spreadsheet) to be examined within the web space of the archive. There is much to be said for this approach; by itself, however, it can prevent (or make unmanageably difficult) uses of the data that, while welcome and useful, may not have been imagined by the original designers. A scholar may, for instance, want to ask a question that an elegant but relatively simple search interface does not allow. Another may want to combine the data from two archives to create a visualization that illuminates previously unknown or unacknowledged connections. The growing number of scholars who have both content expertise and software programming ability increasingly want to access data programmatically (that is, within their own code) rather than by using an intuitive though limited web interface. At the same time, there are often reasons–practical, political, and legal–why it is not always preferable or possible to simply allow users to download the entire dataset for use on their own machines. The most common and arguably best solution to this problem is an Application Programming Interface (API).
An API can be informally defined as a set of published commands that computer programmers can use in their own code to interact with code that they did not write and to which they often have only limited access. For example, an API is often provided to allow third party programmers to retrieve data from a repository that they do not control. The emergence of APIs has facilitated the growth of “mashups”: the combination of data from different sources. Examples of mashups include, for instance, plotting photographs from the photo sharing service Flickr on a Google Map, or dynamically displaying book covers from Amazon.com associated with articles returned from a ProQuest query. Dan Cohen, the Director of the Center for History & New Media, has written, “APIs hold great promise as a method for combining and manipulating various digital resources and tools in a free-form and potent way.” Indeed, the potential for APIs to be used in the humanities is significant.
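As an illustration of programmatic access, the sketch below builds a search request for a repository API and fetches the result with curl. The host archive.example.org, the /items path, and the q and format parameters are all invented for the example; a real repository would publish its own commands.

```shell
#!/bin/sh
# Sketch only: query a HYPOTHETICAL repository API from a script.
# The host, path, and parameter names below are placeholders, not
# those of any real archive.
API_BASE="https://archive.example.org/api"

# search_url TERM [FORMAT]: print the request URL for a search.
# Only spaces are encoded here; real code would need full URL encoding.
search_url() {
    term=$(printf '%s' "$1" | sed 's/ /+/g')
    printf '%s/items?q=%s&format=%s\n' "$API_BASE" "$term" "${2:-xml}"
}

# fetch_results TERM [FORMAT]: retrieve the raw results so that other
# programs can process them, instead of reading a formatted web page.
fetch_results() {
    curl -s "$(search_url "$1" "$2")"
}
```

The point of the design is that the results arrive as data (here XML by default) that a scholar's own code can filter, combine with another archive's output, or visualize.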
The most popular APIs have been produced for commercial products including Flickr, Google Maps, and Freebase. Few Digital Humanities scholars, however, have attempted to create APIs for their scholarly repositories. We are therefore organizing a meeting to bring together both Digital Humanities scholars and industry developers in order to study existing APIs and develop a set of recommendations and best practices for API development in the digital humanities. Sessions at the workshop will be videotaped and placed in an online archive to both preserve the work of the sessions and make it available to those who were unable to attend. Events such as the proposed API workshop not only help develop necessary skills within the Digital Humanities community, but serve as a platform for planning, sharing, and innovating.
September – October 2010: The project team will develop and launch the workshop website. This website will house not only information regarding the event but will also serve as a clearinghouse for all materials relating to the workshop.
November 2010: An official call for participants will be published on the MITH website as well as the workshop website. The two-day workshop will be promoted on Digital Humanities email lists and social media sites (such as Twitter). Potential participants will have one month to submit a brief application that explains who they are and what they’d like to do at the event. From that list, MITH staff will choose a group of participants (should the number of interested parties exceed the 50 participants budgeted). Participants will be selected based on their programming ability and their connection to an existing or emergent digital humanities project which might benefit from an API.
December 2010: Participants will be notified through email about the status of their applications.
Late January 2011 or February 2011: The two-day workshop will be held at the University of Maryland in College Park. Each of the two days will have similar structure: mornings will feature talks by representatives from data repositories with existing, exemplary APIs, followed by afternoon breakout sessions in which small groups of digital humanities practitioners will seek to implement the ideas shared in the morning sessions into their own code. Time at the end of each afternoon will accommodate brief presentations of ideas.
March 2011-May 2011: Lester and a web developer will produce a web exhibition of video from the workshops. Lester will produce a white paper and a set of guidelines for designing APIs for the humanities which will be vetted by workshop participants.
Final Product and Dissemination
The workshop will be promoted through a website developed specifically for the event, through various social media including Twitter and Facebook, the Humanist listserv, and MITH’s community mailing list. In addition, video recordings of the speakers will be made available on the website along with notes from participants. MITH will also publish on the site a set of guidelines for developing web-based APIs for digital humanities projects, and will promote them through the same channels used to promote the workshop. The resulting website will act as a resource for those interested in the event, as well as humanities researchers interested in leveraging APIs in their digital work.
Earlier this month the National Endowment for the Humanities announced 28 new awards from their Digital Humanities Start-Up Grants program, including the funding of my proposal to organize and run a two-day API workshop. The workshop will gather 40-50 digital humanities scholars and developers who are using or interested in using APIs in their digital projects, industry leaders who will demonstrate their APIs, and practitioners who will help guide the group through the “working weekend.” The workshop’s abstract is available online.
I was inspired to organize the MITH API workshop by discussion at last year’s NiCHE API workshop, organized by William Turkel. I hope our workshop will build off the success of NiCHE’s event and offer concrete ways that APIs can be integrated into digital humanities projects today. As part of the event, time for hacking/building is scheduled in afternoons to prototype ideas. Video of presentations will be recorded and made available online for those that can’t attend. I’ll blog further details about the API workshop and how to participate in October.
Congratulations also to my colleagues Tanya Clement and Doug Reside on their “Professionalization in Digital Humanities Centers” workshop, funded by the same NEH program.