Humane Ingenuity 10: The Nature and Locus of Research
by Dan Cohen
It's getting to be that time of the semester when extracurricular activities, like writing this newsletter, become rather difficult. My day job as a university administrator has many to-dos that crescendo in November; I will not trouble HIers with most of these, although I've also been on a special detail this fall co-chairing an initiative to highlight and expand our efforts to combine technical/data skills with human skills, about which I will write in this space in due time. It's very much in the spirit of Humane Ingenuity.
"Desakhya," Artist unknown, Cornell Ragamala Paintings Collection.
Ragamala is a unique form of Indian painting that flourished in the regional courts of the Indic world from the 16th through the 19th centuries. The term translates as a garland, mala, of ragas, meaning melodic types or tonal frameworks. Ragamala painting combines iconography, musical codes, and poetry to indicate the time of day or season appropriate to the raga and its mood.
Follow-up on GPT-2
Point: HIer Hillary Corbett noted one potentially problematic use for GPT-2 in the academy: In the constant push for more publications (encouraged, I should note, by increasingly quantified assessment of faculty research activity in many countries), researchers could use GPT-2 to generate plausible articles from fairly modest seed text. Hillary took a few lines from a chapter she wrote and got generally acceptable completion text. (Associated thought: the Sokal hoax as an artisanal pre-GPT-2 scholarly communication deep fake.)
Counterpoint: There is now a Chrome extension that identifies GPT-2-generated text.
Again, my interest in GPT-2 has less to do with the technology than with the powerful human propensity to respond to, and often uncritically accept, expressions that fit into genres. We are genre-seeking creatures, and GPT-2 highlights a cultural version of our basic urge to fit things into categories (and also, alas, to stereotype).
I could have just as easily focused on music. For instance, earlier this year, Endel became the first AI-based generative music system to sign a deal with a major record label. Like GPT-2, Endel takes music seeds and grows new music that conforms to genre norms. Since music, perhaps more than any other form of human expression, relies on repetition and slight modifications from prior art, musical genres can have an even more powerful attraction to the listener than textual genres to the reader. (Just think about music today: the reggaeton beat has powered a dozen huge hits in the last few years.)
I'll leave the last word on GPT-2 and its ilk to Janelle Shane (with appreciation from this Victorianist for the conclusion):
One of the disadvantages of having a neural net that can string together a grammatical sentence is that its sentences now can begin to be terrible in a more-human sense, rather than merely incomprehensible. It ventures into the realm of the awful simile, or the mindnumbingly repetitive, and it makes a decent stab at the 19th century style of bombastic wordiness.
The Nature and Locus of Research
One of the big issues in academia right now is the shift of much of the research in areas this newsletter has covered, such as machine learning, to the private sector. There are many reasons for this, but the main ones are that the biggest data sets and the most advanced technology are now at companies like Facebook and Google, and also these companies pay researchers far more than we can in regular faculty or postdoc positions.
This has made it increasingly hard to find and retain faculty to teach the next generation of students in many topics that are in high demand. What I want to focus on here, however, is its troubling effect on the nature of research. Corporations have always had research centers, of course, from which incredible innovations have arisen; just think about Bell Labs or Xerox PARC. Since the Second World War, there has always been a place for someone like Claude Shannon to ride through corporate hallways on a unicycle thinking about information theory, and to lay the groundwork for our modern world.
But these corporate research spaces have become much more mercenary and application-oriented in the last decade. Google's Director of Research, Peter Norvig, perhaps the archetype of the academic who left academia because (as he once put it) he had to go where the data was, is always sure to highlight that he doesn't want to replicate Bell Labs' or Xerox PARC's slightly clueless abstraction, even if great things eventually emerged from those institutions. He wants Google research to lead to new businesses and more uses of Google's search engine (even if indirectly).
Which is totally fine. But by drawing researchers fully out of the academy, we lose not only teachers and mentors, but a style of thinking and research that is different in important ways. An example: Last week on the What's New podcast I interviewed Ennio Mingolla, a scholar of human and computer vision. Ennio is brilliant, and undoubtedly would be a highly valued researcher in, say, an autonomous vehicle startup. Yet he retains academia's more expansive approach to thinking and research, in a way that is likely to be much more helpful, over time, to understanding vision.
On the podcast, Ennio and I discussed philosophy and art—knowledge from the distant past and from non-digital realms—just as much as the latest computational approaches to "seeing." We touched on empiricism, Leonardo da Vinci's discoveries in sketching and painting, and William James—not because we're fancy academics but because those topics present essential and varied theoretical approaches to the subject of vision. Freed from the right now and the near future, we can explore the ideas of those from the past who had also thought deeply about seeing, and how those concepts very well may present a helpful framing for contemporary work in the field.
Ennio is an expert in figure-ground separation, the human ability to make out an object from the scene behind it. This is a critical survival and social skill (noticing a lion in the tall grass, paying attention to faces in a crowd), and extraordinarily complicated. It's also directly related to what self-driving cars need (noticing a pedestrian in the crosswalk, paying attention to other objects in the terrain ahead). By considering vision not as a GPU-intensive task involving pixels and frames from a digital camera or LIDAR, but as a complex set of systems and skills networked in the brain, Ennio and his Computational Vision Lab are developing a much richer (and I believe more accurate) understanding of how we see. This may take decades; it has taken decades to understand even some basic visual skills such as how we sense that something is approaching us quickly (which, as Ennio notes, is a process that is nothing like what you think it is, and is both faster and slower than a computer).
Universities also have scaffolding for research that most companies don't. Institutional review boards, for instance, try to ensure that research doesn't hurt people or have unintended consequences. IRBs can be annoying friction—ask any academic researcher—but given what has happened in our world with the use of personal data over the last few years, maybe we need those brakes more than ever.
There used to be an imperfect but useful pathway for research to move from the academy to the corporate world through tech transfer. That pathway has been disrupted by the tech/data/salary gap and the fact that it's hard to find a way to share tech/data/salary between corporations and the academy. On the data front, initiatives like Social Science One, which was established to share large data sets between entities like Facebook and academic researchers, are floundering as Facebook and other giant companies hunker down in the face of criticism about privacy and their social effects. Sharing faculty between academia and corporations (in roles like affiliated, non-tenure track faculty) can be tricky to get right. Facebook, for example, only allows employees to spend 20% of their time at a university in such a role, and you can imagine which side has priority in the case of any important matter.
We need to find some new models that allow for the permeability of academia, for new kinds of partnerships, while retaining what makes thoughtful, deep academic research so critical over time. From a dean's perspective this is of some urgency, but from a social and scholarly perspective, I think the problem hasn't been addressed nearly enough. It will greatly affect the kinds and the style of research that are done in the future and, in the long run, limit the knowledge we produce and value.
"Todi," Artist unknown, Cornell Ragamala Paintings Collection.
The Enchantment of Archaeology Through Computers
HIer Shawn Graham, mentioned in HI8, kindly sent me a full draft of his forthcoming book, An Enchantment of Digital Archaeology: Raising the Dead with Agent Based Models, Archaeogaming, and Artificial Intelligence. I haven't had a chance to read the whole thing yet, but plan to do so over the winter break. A taste of what Shawn explores in the book:
What is more rational than a computer, reducing all phenomena down to tractable ones and zeros? What is more magical than a computer, that we tie our identities to the particular hardware or software machines we use?...Archaeology, as conventionally practiced, uses computation to effect a distancing from the world; perhaps not intentionally but practically. Its rituals (the plotting of points on a map; the carefully controlled vocabularies to encode the messiness of the world into a database and thence a report, and so on) relieves us of the task of feeling the past, of telling the tales that enable us to envision actual lives lived. The power of the computer relieves us of the burden of having to be human.
An enchanted digital archaeology remembers that when we are using computers, the computer is not a passive tool. It is an active agent in its own right (in the same way that an environment can be seen to be active)...In that emergent dynamic, in that co-creation with a non-human but active agent, we might find the enchantment, the magic of archaeology that is currently lacking in archaeology.
Citizen DJ
Finally, it was neat to see that the Library of Congress has given Brian Foo a residency for 2020. Brian was behind the very creative Data-Driven DJ project, and he will be building something called "Citizen DJ" at the LC—"an application enabling anyone with a web browser to create hip hop music with public domain audio and video materials from the Library's collections."