by Dan Cohen
I am very fortunate to live a short drive from Walden Pond, of Henry David Thoreau fame. With the hordes of summer tourists finally thinning out, and with the leaves changing with the arrival of fall, it's a good time to stroll around the pond, which we did last weekend.
Those who haven't taken the walking path around Walden Pond before are generally surprised by several things: 1) it's rather small; 2) train tracks run right next to the walking path on one side of the pond; and especially 3) Thoreau's cabin is not that far off the road, and within trivial walking distance of the center of Concord. If Thoreau were alive today, he could, on a whim, go grab some nice warm coffee and a book at a really good book store, and be back in the woods in time to light a fire for dinner.
Those amenities, of course, did not exist in the middle of the nineteenth century when Thoreau took his leave from society, but still, he was only a short stroll from other houses in the area and a mere mile and a half from his family home. New visitors to Walden realize that his off-the-grid life was a little more like grid-adjacent.
It struck me on this recent visit, however, that Thoreau's perhaps not-so-radical move presents something of a model for us as we struggle with our current media environment. Maybe moving just a bit off to the side, removed but not totally ascetic, is a helpful way to approach our troubled relationship with digital media and technology.
Indeed, the "Republic of Newsletters" is just a bit off to the side, existing in niche digital eddies rather than vast digital rivers, using the old-fashioned wonder of email and even the web, but not being so world wide. Sometimes to find your humanity you must step outside of the mass of society and its current, unchallenged habits. But don't go too far. You should still be able to get a warm cup of coffee and a good book.
I'm writing to you from Tampa, where the heat and humidity make me want to dive into the bracing chill of Walden Pond. The Digital Library Federation is having its annual forum here, and while DLF sounds like it could be a cool Star Trek thing, in actuality its spirit is closer to Thoreau than one might imagine.
The hundreds of practitioners who attend DLF every year (librarians, archivists, museum professionals, software developers, and researchers) have increasingly taken on the responsibility of thinking about how to be deliberate and considerate in our use of digital media and technology: the practice of humane ingenuity. The kinds of questions that are asked here are ones we could easily tailor to other areas of our lives: What are the kinds of human expression we should highlight and preserve, and how can we ensure diverse voices in that record? How can we present images in ways that are sensitive to how different kinds of viewers might see them and use them? How can digital tools help rather than hinder our explorations of our shared culture?
The Federation is now 25 years old. Twenty-five years ago there were virtually no digital libraries; now there are countless ones, and some, like the Digital Public Library of America, have tens of millions of items from thousands of cultural heritage organizations. The next 25 years seem to be less about a rapid build-out and more about the hard work of conscientious maintenance and correcting the problems now clearly inherent in poorly designed digital platforms. And there are some exciting new methods emerging to take advantage of new computational techniques, but DLFers are dedicated to implementing them in ways that prevent social problems from emerging in the first place.
It's great to see some HI readers and old friends here at DLF. Josh Hadro of the IIIF Consortium (part of the DLF Cinematic Universe) helpfully provided some additional examples of the use of AI/ML on digital collections. The Center for Open Data in the Humanities in Japan is using machine learning to extract facial expressions and types of characters in Japanese manuscripts.
Josh and I also wistfully remembered the great potential of the NYC Space/Time Directory at NYPL, pieces of which could perhaps be revived and implemented in other contexts...
Amy Rudersdorf of AVP and Juliet Hardesty of Indiana University presented some exciting work on MGMs—metadata generation mechanisms (also part of the DLF Cinematic Universe). MGMs can include machine learning services as well as human expertise, and critically, they can be strung together in a flexible way so that you can achieve the best accuracy from the right combination of tools and human assessment. Rather than using one mechanism, or choosing between computational and human methods, MGMs such as natural language processing, facial recognition, automated transcription, OCR, and human inputs can all be employed in a single connected thread. The project Amy and Juliet outlined, the Audiovisual Metadata Platform (AMP), seems like a thoughtful and promising implementation of AI/ML to make difficult-to-index forms of human expression—such as a concert or a street protest—more widely discoverable. I will be following this project closely.
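For the more code-minded readers, here is one way to picture MGMs being "strung together." This is only a toy sketch of the general idea, not AMP's actual design or API: every name in it (`Record`, `run_workflow`, the stand-in mechanisms, the tiny vocabulary) is my own invention for illustration, and the "machine" steps are fakes that return canned results.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """A hypothetical media item plus the metadata gathered so far."""
    media_id: str
    metadata: Dict[str, object] = field(default_factory=dict)


# An MGM, in this sketch, is anything that takes a record and returns it enriched.
Mechanism = Callable[[Record], Record]

# Assumed toy vocabulary; a real tagger would be far richer.
VOCAB = {"concert", "protest"}


def fake_transcription(record: Record) -> Record:
    # Stand-in for an automated transcription service (canned output).
    record.metadata["transcript"] = "welcome to the concert in Concord"
    return record


def fake_keyword_tagger(record: Record) -> Record:
    # Stand-in for an NLP step: keep transcript words found in VOCAB.
    words = str(record.metadata.get("transcript", "")).split()
    record.metadata["keywords"] = sorted({w for w in words if w in VOCAB})
    return record


def human_review(corrections: Dict[str, object]) -> Mechanism:
    # A human step is just another mechanism: it applies a person's
    # corrections and notes that someone looked at the record.
    def step(record: Record) -> Record:
        record.metadata.update(corrections)
        record.metadata["reviewed_by_human"] = True
        return record
    return step


def run_workflow(record: Record, mechanisms: List[Mechanism]) -> Record:
    # Run each mechanism in order, passing the enriched record along.
    for mechanism in mechanisms:
        record = mechanism(record)
    return record


result = run_workflow(
    Record("tape-001"),
    [fake_transcription, fake_keyword_tagger, human_review({"venue": "Concord"})],
)
print(result.metadata)
```

The point of the sketch is that machine and human steps share one shape, so you can reorder them, swap one tool for another, or add a second round of human review without rewriting the whole thread.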
Finally, Sandy Hervieux and Amanda Wheatley are editing a new volume on artificial intelligence in the library: The Rise of AI: Implications and Applications of Artificial Intelligence in Academic Libraries. They are looking for authors to contribute chapters. Maybe that's you?