The HiPSTAS project’s primary objective is to develop a virtual research environment in which users can better access and analyze spoken word collections of interest to humanists through:
- an assessment of scholarly requirements for analyzing sound
- an assessment of technological infrastructures needed to support discovery
- preliminary tests that demonstrate the efficacy of using such tools in humanities scholarship
There are hundreds of thousands of hours of important spoken text audio files, dating from the nineteenth century to the present day. Many of these audio files, which comprise poetry readings, interviews with folk musicians, artisans, and storytellers, and stories by elders from tribal communities, contain the only recordings of significant literary figures and bygone oral traditions. These artifacts are only marginally accessible for listening and almost completely inaccessible for new forms of analysis and instruction in the digital age. For example, an Ezra Pound scholar who visits PennSound online and would like to analyze how Pound’s cadence shifts across his 1939 Harvard Vocarium Readings, his wartime radio speeches, and his post-war Caedmon Recordings (June 1958) must listen to each file, one by one, to determine how (or if) patterns change across the collection. An Ojibwe oshkabewis (“one empowered to translate between the spiritual and mundane worlds”) seeking to teach students about the ways in which an Ojibwe elder uses Ojibwemowin (‘the Ojibwe language’) at culturally significant moments to enhance English descriptions with spiritual elements has few means to map or show students when these transitions or “traditional cultural expressions” (TCE) occur. And a scholar doing research within the Oral History of the Texas Oil Industry Records at the Dolph Briscoe Center for American History can only discover the hidden recording of Robert Frost reading “Stopping by Woods on a Snowy Evening” among other poems on Side B of folklorist William A. Owens’ recordings because a diligent archivist included that fact in the metadata.
Not only do scholars have limited access to spoken word audio, but their ability to do new kinds of research (what Jerome McGann calls “imagining what you don’t know”) and to share these methodologies with colleagues and students is almost entirely inhibited by present modes of access. What other TCEs and important historical moments are hidden in these sound files? What if we could test hypotheses concerning the prosodic patterns of Beat poets in comparison to the “high modernists” using the more than thirty-five thousand audio recordings in PennSound? What if we could automatically detect the difference between poetry and prose to determine when a poem is over and an author is telling us about the poem? Or determine, perhaps, whether and when the Ojibwe storytellers sound like elders from supposedly unrelated tribes? At this time, even though we have digitized hundreds of thousands of hours of culturally significant audio artifacts and have developed increasingly sophisticated systems for computational analysis of sound, there is no provision for any kind of analysis that lets one discover, for instance, how prosodic features change over time and space, how tones differ between groups of individuals and types of speech, or how one poet or storyteller’s cadence might be influenced by or reflected in another’s. Nor is there any provision for scholars interested in spoken texts such as speeches, stories, and poetry to use, or to understand how to use, high performance technologies for analyzing sound.
In August 2010, the Council on Library and Information Resources and the Library of Congress issued a report titled The State of Recorded Sound Preservation in the United States: A National Legacy at Risk in the Digital Age. This report suggests that if scholars and students do not use sound archives, our cultural heritage institutions will not preserve them. Librarians and archivists need to know what scholars and students want to do with sound artifacts in order to make these collections more accessible; likewise, scholars and students need to know what kinds of analysis are possible in an age of large, freely available collections and advanced computational analysis and visualization.
To this end, the School of Information at the University of Texas at Austin and the Illinois Informatics Institute at the University of Illinois at Urbana-Champaign received a 2012 NEH Institutes in Advanced Technologies in the Digital Humanities grant to host two rounds of an Institute on High Performance Sound Technologies for Access and Scholarship (HiPSTAS) in May 2013 and May 2014. Humanists interested in sound scholarship, stewards of sound collections, and computer scientists and technologists versed in computational analytics and visualizations of sound gathered to consider how more productive tools for advancing scholarship in spoken text audio could be developed. We also received a Preservation and Access Research and Development grant, running through December 2015, specifically to develop ARLO as an open source tool for use in archives and special collections.