I’m at Digital Humanities 2012 in Hamburg. I’m writing a conference report on philosophi.ca. The conference started with a keynote by Claudine Moulin that touched on research infrastructure. Moulin was the lead author of the European Science Foundation report on Research Infrastructure in the Humanities (link to my entry on this). She talked about the need for a cultural history of research infrastructure (which the report actually provides.) The humanities should not just import ideas and stories about infrastructure. We should use this infrastructure turn to help us understand the types of infrastructure we already have; we should think about the place of infrastructure in the humanities as humanists.
Susan pointed me to Pundit: A novel semantic web annotation tool. Pundit (which has a great domain name “thepund.it”) is an annotation tool that lets people create and share annotations on web materials. The annotations are triples that can be saved and linked into DBpedia and so on. I’m not sure I understand how it works entirely, but the demo is impressive. It could be the killer-app of semantic web technologies for the digital humanities.
The French have pulled the plug on Minitel, the videotex service that was introduced in 1982, 30 years ago. I remember seeing my first Minitel terminal in France where I lived briefly in 1982-83. I wish I could say I understood it at the time for what it was, but what struck me then was that it was an awkward replacement for the phonebook. Anyway, as of June 30th, Minitel is no more and France says farewell to the Minitel.
Minitel is important because it was the first large-scale information service. It turned out not to be as scalable and flexible as the web, but for a while it provided the French with all sorts of text services from directories to chat. It is famous for the messageries roses (pink messages) or adult chat services that emerged (and helped fund the system).
In the late 1980s Bell introduced a Canadian version of Minitel called Alex (after Alexander Graham Bell), first in Quebec and then in Ontario. The service was too expensive and never took off. Thanks to a letter in today’s Globe I discovered that there was some interesting research and development into videotex services in Canada at the Communications Research Centre in the late 1970s and 1980s. Telidon was a “second generation” system that had true graphics, unlike Minitel.
Despite all sorts of interest and numerous experiments, videotex was never really successful outside of France/Minitel. It needs a lot of content for people to be willing to pay the price and the broadcast model of most trials meant that you didn’t have the community generation of content needed. Services like CompuServe that ran on PCs (instead of dedicated terminals) were successful where videotex was not, and ultimately the web wiped out even the services like CompuServe.
What is interesting, however, is how much interest and investment there was around the world in such services. The telecommunications industry clearly saw large-scale interactive information services as the future, but they were wedded to centralized models for how to try and evolve such a service. Only the French got the centralized model right by making it cheap, relatively open, and easy. That it lasted 30 years is an indication of how right Minitel was, even if the internet has replaced it.
The Digging Into Data program commissioned CLIR (Council on Library and Information Resources) to study and report on the first round of the programme. The report includes case studies on the 8 initial projects including one on our Criminal Intent project that is titled Using Zotero and TAPOR on the Old Bailey Proceedings: Data Mining with Criminal Intent (DMCI). More interesting are some of the reflections on big data and research in the humanities that the authors make:
1. One Culture. As the title hints, one of the conclusions is that in digital research the lines between disciplines and sectors have been blurred to the point where it is more accurate to say there is one culture of e-research. This is obviously a play on C. P. Snow’s Two Cultures. The two cultures of the sciences and the humanities, which have been alienated from each other for a century or two, are now coming back together around big data.
Rather than working in silos bounded by disciplinary methods, participants in this project have created a single culture of e-research that encompasses what have been called the e-sciences as well as the digital humanities: not a choice between the scientific and humanistic visions of the world, but a coherent amalgam of people and organizations embracing both. (p. 1)
2. Collaborate. A clear message of the report is that to do this sort of e-research people need to learn to collaborate and by that they don’t just mean learning to get along. They mean deliberate collaboration that is managed. I know our team had to consciously develop patterns of collaboration to get things done across 3 countries and many more universities. It also means collaborating across disciplines and this is where the “one culture” of the report is aspirational – something the report both announces and encourages. Without saying so, the report also serves as a warning that we could end up with a different polarization just as the separation of scientific and humanistic culture is healed. We could end up with polarization between those who work on big data (of any sort) using computational techniques and those who work with theory and criticism in the small. We could find humanists and scientists who use statistical and empirical methods in one culture while humanists and scientists who use theory and modelling gather as a different culture. One culture always spawns two and so on.
3. Expand Concepts. The recommendations push the idea that all sorts of people/stakeholders need to expand their ideas about research. We need to expand our ideas about what constitutes research evidence, what constitutes research activity, what constitutes research deliverables and who should be doing research in what configurations. The humanities and other interpretative fields should stop thinking of research as a process that turns the reading of books and articles into the writing of more books and articles. The new scale of data calls for a new scale of concepts and a new scale of organization.
It is interesting how this report follows the creation of the Digging Into Data program. It is a validation of the act of creating the programme and creating it as it was. The funding agencies, led by Brett Bobley, ran a consultation and then gambled on a programme designed to encourage and foreground certain types of research. By and large their design had the effect they wanted. To some extent CLIR reports that research is becoming what Digging encouraged us to think it should be. Digging took seriously Greg Crane’s question, “what can you do with a million books”, but they abstracted it to “what can you do with gigabytes of data?” and created incentives (funding) to get us to come up with compelling examples, which in turn legitimize the program’s hypothesis that this is important.
In other words we should acknowledge and respect the politics of granting. Digging set out to create the conditions where a certain type of research thrived and got attention. The first round of the programme was, for this reason, widely advertised, heavily promoted, and now carefully studied and reported on. All the teams had to participate in a small conference in Washington that got significant press coverage. Digging is an example of how granting councils can be creative and change the research culture.
The Digging into Data Challenge presents us with a new paradigm: a digital ecology of data, algorithms, metadata, analytical and visualization tools, and new forms of scholarly expression that result from this research. The implications of these projects and their digital milieu for the economics and management of higher education, as well as for the practices of research, teaching, and learning, are profound, not only for researchers engaged in computationally intensive work but also for college and university administrations, scholarly societies, funding agencies, research libraries, academic publishers, and students. (p. 2)
The word “presents” can mean many things here. The new paradigm is both a creation of the programme and a result of changes in the research environment. The very presentation of research is changed by the scale of data. Visualizations replace quotations as the favored way into the data. And, of course, granting councils commission reports that re-present a heady mix of new paradigms and case studies.
A couple of weeks ago I gave a talk at Digital Infrastructure Summit 2012 which was hosted by the Canadian University Council of Chief Information Officers (CUCCIO). This short conference was very different from any other I’ve been at. CUCCIO, by its nature, is a group of people (university CIOs) who are used to doing things. They seemed committed to defining a common research infrastructure for Canadian universities and trying to prototype it. It seemed all the right people were there to start moving in the same direction.
For this talk I prepared a set of questions for auditing whether a university has good support for digital research in the humanities. See Check IT Out!. The idea is that anyone from a researcher to an administrator can use these questions to check out the IT support for humanists.
The other day while browsing around looking for books to read on my iPad I noticed what looked like a dissertation for sale. I’ve been wondering how dissertations could get into e-book stores when I remembered the license that graduate students are being asked to sign these days by Theses Canada. The system here encourages students to give a license to Library and Archives Canada that includes the right,
(a) to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell my thesis (the title of which is set forth above) worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats;
I just came across a cautionary story in the Chronicle of Higher Education, Dissertation for Sale: A Cautionary Tale. It seems this is also allowed in the US.
For those wondering why I haven’t been blogging and why Theoreti.ca seems to be unavailable, the answer is that the blog has been hacked and I’m trying to solve the problem. My ISP rightly freezes things when the blog seems to send spam. Sorry about all this!
I posted on 4Humanities a questionnaire that I call Check IT Out!. The idea is to give administrators and researchers a tool for checking out the research information technology (IT) that they have at their university. I developed it for a talk I give tomorrow at the Digital Infrastructure Summit 2012 in Saskatoon. I’m on the “Reality Check Panel” that presents realities faced by researchers. Check IT Out! is meant to address the issue of getting basic computing support and infrastructure for research. It is often sexier to build something new than to make sure that researchers have the basics. That raises the question of what are the basics, which is why I thought I would frame Check IT Out! as a series of questions, not assertions. Often people in computing services know the answers to these, but our colleagues don’t even know how to frame the question.
We have just had a short letter published in Medical Teacher on Serious games for patient safety education. The letter reports on the research that a team of us spanning medicine and humanities computing did around using an FPS (First Person Shooter) for teaching medical communication. I reported earlier on Insight, a juried exhibit at which we showed the game.
The Canadian Association of University Teachers has a campaign to Save Library and Archives Canada from the “Badly conceived restructuring, a redefinition of its mandate, and financial cutbacks (that) are undermining LAC’s ability to acquire, preserve and make publicly available Canada’s full documentary heritage.” The issue is not just cuts, but how LAC is dealing with the cuts.
Daniel Caron, Librarian and Archivist of Canada, has announced that “the new environment is totally decentralized and our monopoly as stewards of the national documentary heritage is over.”
LAC will be decentralizing a large portion of its collections to both public and private institutions. LAC documents refer to this voluntary group of “memory institutions” as a “coalition of the willing.”
Go to the site now, read up on the issues, and consider taking action!
From @nowviskie a New York Times article on the New ‘Digital Divide’ Seen in Wasting Time Online.
As access to devices has spread, children in poorer families are spending considerably more time than children from more well-off families using their television and gadgets to watch shows and videos, play games and connect on social networking sites, studies show.
This fits in interesting ways with research I’ve come across in two other contexts. First, it fits with what Valerie Steeves talked about at the GRAND 2012 conference I went to. (See my conference notes.) She reported on her Young Canadians in an Online World research – she has been interviewing young Canadians, their parents and teachers over the years. Between 2000 and now there has been a shift in attitude towards the internet from believing it was good for learning to thinking of it as a minefield.
The other context is a cool book I’m reading on keitai or mobile phones in Japan. Personal, Portable, Pedestrian is a collection edited by Mizuko Ito, Daisuke Okabe and Misa Matsuda about the cell phone phenomenon in Japan. They point out in passing how there are significant national/cultural differences in how technologies are picked up and used.
In the case of the PC Internet, differences in adoption were most often couched in terms of a digital divide, of haves and have-nots in relation to a universally desirable technological resource. By contrast, mobile media are frequently characterized as having different attractions depending on local contexts and cultures. The discourse of the digital divide has been mobilized in relation to Japanese keitai Internet access (see chapter 1) and is implicit in the discourse suggesting that the United States needs to catch up to Japanese keitai cultures. (p. 6)
While we need to be aware of differences in access to technology, we also should be critical of the assumptions underlying the discourse of divides. Why do we assume that the Internet is good and mobiles less so? Why did the Japanese discourse switch from viewing keitai as promoting youth rudeness and isolation to arguing for Japanese technonationalist exceptionalism (we use mobiles more because there is something exceptional about Japanese culture/spirit)?
Which reminds me of a TechCrunch article on How The Future of Mobile Lies in the Developing World. Cell phones for us are one more gadget with which to access the Internet. In the developing world they are revolutionary in that they leapfrogged the problems of physical infrastructure (phone wires) and now provide connectivity for many who had none. It is no wonder that the growth in the cell market is in the developing world.
For many communities, simple voice and text connections have brought about revolutions in access to financial, health, agricultural and education services and opportunities for employment. For example, many farmers in rural areas in Africa and Asia use SMS services to find out the daily prices of agricultural commodities. This information allows them to improve their bargaining position when taking their goods to market, and also allows them to switch between end markets.
From Slashdot I found this blog entry Ocracoke Island Journal: Nookd about how a Nook version of War and Peace had the word “kindle” replaced by “nook” as in “It was as if a light has been Nooked (kindled) in a carved and painted lantern…” It seems that the company that ported the Kindle version over to the Nook ran a search and replace on the word Kindle and replaced it with Nook.
I think this should be turned into a game. We should create an e-reader that plays with the text in various ways. We could adapt some of Steve Ramsay’s algorithmic ideas (reversing lines of poetry). Readers could score points by clicking on the words they think were replaced and guessing the correct one.
I’m sitting at Congress 2012 in the beer tent at Wilfrid Laurier. I’ve been writing a conference report of SDH/SEMI 2012. But in the beer tent they are talking about the ARG that Neil Randall (may have) started called Bonfire of the Humanities. Apparently the dean may have shut it down, but traces are left, see #bonfireofthehumanities. See also the YouTube video, Torch Institute Declares War Against University of Waterloo.
Because some may misunderstand, the Torch Institute is probably an Alternate Reality Game (ARG) satirizing the academy. With ARGs you never know what is real or not. The dean shutting things down, like the removal of the YouTube video above, may or may not be part of the game script. (You can see other Torch Institute videos here.) The guiding idea behind ARGs is TINAG (This is not a game.) ARGs are supposed to be games only in so far as you play with what may or may not be the game. Who knows about the Torch Institute.
CNN has a story on ‘The Demise of Guys’: How video games and porn are ruining a generation. The book is The Demise of Guys: Why Boys Are Struggling and What We Can Do About It and it is by retired Stanford psychologist Philip Zimbardo and Nikita Duncan. The book builds on a TED Talk that argues that:
“Boys’ brains are being digitally rewired for change, novelty, excitement and constant arousal. That means they’re totally out of sync in traditional classes, which are analog, static, interactively passive.” (Zimbardo)
Compare this to Hanna Rosin: New data on the rise of women who argues that “the global economy is becoming a place where women are more successful than men.” She argues that there has been a hollowing out of the middle-class jobs men held for service jobs that women do better. Could the shift from a manufacturing to a service economy be responsible for the “demise of guys?”
Bloomberg has a story that Warren Buffett Says Free News Unsustainable, May Add More Papers. The days of expecting news online to be free to access may be coming to an end. We may find more and more news behind paywalls of the sort the New York Times brought in where you only get so many free articles a month.
Buffett believes that local papers with a “community focus” can make a profit as they are often the only source for community news. There will always be free alternatives for national or international news, but community newspapers often don’t have free alternatives.
This bodes well for journalism, which has suffered recently. That decline has, I believe, created a democracy gap as the fourth estate loses its ability to monitor the others. Bloggers don’t reliably replace investigative journalism that profits from reporting on government and industry.
From Slashdot another story hinting at how government agencies are organizing to intercept and interpret Internet data. See FBI quietly forms secretive Net-surveillance unit.
My guess is that data mining large amounts of data produces so many false positives that organizations like the NSA and FBI have to set up large units to follow up on results. There is an interesting policy paper by Jeff Jonas and Jim Harper on Effective Counterterrorism and the Limited Role of Predictive Data Mining that argues that predictive mining isn’t worth it. The cost of false positives for industry when they use predictive data mining (predicting who might buy your product) is acceptable. The costs of false positives for counterterrorism are prohibitive as it takes trained agents away from better uses of their time. I doubt anyone in this climate is willing to give up on mining, which is why The NSA is Building the Country’s Biggest Spy Center.
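The false-positive argument is at bottom base-rate arithmetic, and a back-of-the-envelope sketch in Python shows how quickly it bites. All the numbers below are made up for illustration (they are not from the Jonas and Harper paper), and the function name is my own.

```python
def screening_outcomes(population, true_targets, sensitivity, false_positive_rate):
    """Return (true hits, false alarms, share of flagged people who are real targets)."""
    flagged_true = true_targets * sensitivity
    flagged_false = (population - true_targets) * false_positive_rate
    precision = flagged_true / (flagged_true + flagged_false)
    return flagged_true, flagged_false, precision


# Hypothetical scenario: 300 million people, 1,000 real targets, and a
# classifier that catches 99% of targets while wrongly flagging 1% of
# everyone else.
hits, false_alarms, precision = screening_outcomes(
    population=300_000_000,
    true_targets=1_000,
    sensitivity=0.99,
    false_positive_rate=0.01,
)
print(f"real targets flagged: {hits:,.0f}")
print(f"innocent people flagged: {false_alarms:,.0f}")
print(f"chance a flagged person is a real target: {precision:.4%}")
```

Even with a classifier far better than anything plausible, roughly three million innocent people get flagged alongside fewer than a thousand real targets under these assumptions. A marketer can live with that ratio; an agency that has to send an agent after each flag cannot.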
I wonder if we will ever know if money spent on voice and text mining is useful in counterintelligence? Perhaps the rumour of the possibility of it working is enough?
Lindsay Thomas, the hard working blogger for 4Humanities has written an excellent piece On Graduate Education in the Humanities, by a Graduate Student in the Humanities. She talks about how hard it is to complete quickly when you are making ends meet by TAing and teaching constantly. She talks about the “casualization” of academic labor.
I would add to her essay that we need to think about expanding outcomes for graduate students. We design graduate programs to produce junior faculty (or casual labor who hang on in hopes of getting full-time faculty jobs.) What we don’t do is to design programs so that they prepare people for knowledge work outside the academy. This is not rocket science; there are all sorts of ways to do it and digital humanities programs could take the lead as our students acquire skills of broader relevance. But, as Lindsay points out, if you start changing or adding to graduate programs you can just extend the time to completion and students might end up no better off.
The Globe and Mail had a very interesting article on how Twitter hands your data to the highest bidder, but not to you. The article talks about how Twitter is archiving your data, selling it, but not letting you access your old tweets. The article mentions that DataSift is one company that has been licensed to mine the Twitter archives. DataSift presents itself as “the world’s most powerful and scalable platform for managing large volumes of information from a variety of social data sources.” In effect they do real-time text analysis for industry. Here is what they say in What we do:
DataSift offers the most powerful and sophisticated tools for extracting value from Social Data. The amount of content that Internet users are creating and sharing through Social Media is exploding. DataSift offers the best tools for collecting, filtering and analyzing this data.
Social Data is more complicated to process and analyze because it is unstructured. DataSift’s platform has been built specifically to process large volumes of this unstructured data and derive value from it.
One thing that DataSift has is a curation language called CSDL (Curated Stream Definition Language) for querying the cloud of data they gather. They provide an example of what you can do with it:
Here’s an example, just for illustration, of a complex filter that you could build with only four lines of CSDL code: imagine that you want to look at information from Twitter that mentions the iPad. Suppose you want to include content written in English or Spanish but exclude any other languages, select only content written within 100 kilometers of New York City, and exclude Tweets that have been retweeted fewer than five times. You can write that in just four lines of CSDL!
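I don't have the CSDL itself to hand, so rather than guess at its syntax here is a rough Python sketch of the same four conditions applied to a single tweet. The dictionary fields (`text`, `lang`, `coords`, `retweets`) are hypothetical and are not DataSift's or Twitter's actual schema; the distance is a standard haversine calculation.

```python
from math import asin, cos, radians, sin, sqrt

NYC = (40.7128, -74.0060)  # approximate latitude and longitude of New York City
EARTH_RADIUS_KM = 6371


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) pairs in km."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def matches(tweet):
    """Apply the four conditions from the quoted example to one tweet."""
    return (
        "ipad" in tweet["text"].lower()                 # mentions the iPad
        and tweet["lang"] in {"en", "es"}               # English or Spanish only
        and distance_km(tweet["coords"], NYC) <= 100    # within 100 km of NYC
        and tweet["retweets"] >= 5                      # retweeted at least five times
    )


# A made-up tweet sent from Brooklyn.
sample = {
    "text": "Love my new iPad",
    "lang": "en",
    "coords": (40.6782, -73.9442),
    "retweets": 7,
}
print(matches(sample))
```

Four conditions, four lines of filter, so the claim about brevity is plausible. The hard part is not the query but having the stream of data to run it against.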
It would be interesting to develop an academic alternative similar to Archive-It, but for real-time social media tracking.
The latest version of our Old Bailey Datawarehousing Interface is up. This was the Digging Into Data project that got TAPoR, Zotero and Old Bailey working together. One of the things we built was an advanced visualization environment for the Old Bailey. This was programmed by John Simpson following ideas from Joerg Sanders. Milena Radzikowska did the interface design work and I wrote emails.
One feature we have added is the broaDHcast widget that allows projects like Criminal Intent to share announcements. This was inspired partly by the issues of keeping distributed projects like TAPoR, Zotero and Old Bailey informed.
The GRAND group has a work being exhibited at the InSight: Visualizing Health Humanities show that starts tonight. We used Unity to create an FPS (First Person Shooter) type of game for medical communication. The game, called CatHETR, lets players move through a ward dealing with communicative situations. This project was supported by the GRAND Network of Centres of Excellence.