There are hundreds of thousands of hours of important spoken text audio files, dating back to the nineteenth century and up to the present day. Many of these audio files, which comprise poetry readings, interviews with folk musicians, artisans, and storytellers, and stories by elders from tribal communities, contain the only recordings of significant literary figures and bygone oral traditions. These artifacts are only marginally accessible for listening and almost completely inaccessible for new forms of analysis and instruction in the digital age. For example, an Ezra Pound scholar who visits PennSound online and would like to analyze how Pound’s cadence shifts across his 1939 Harvard Vocarium Readings, his wartime radio speeches, and his post-war Caedmon Recordings (June 1958) must listen to each file, one by one, to determine how (or whether) patterns change across the collection. An Ojibwe oshkabewis (“one empowered to translate between the spiritual and mundane worlds”) seeking to teach students about the ways in which an Ojibwe elder uses Ojibwemowin (‘the Ojibwe language’) at culturally significant moments to enhance English descriptions with spiritual elements has few means to map or show students when these transitions or “traditional cultural expressions” (TCE) occur. And a scholar doing research within the Oral History of the Texas Oil Industry Records at the Dolph Briscoe Center for American History can discover the hidden recording of Robert Frost reading “Stopping by Woods on a Snowy Evening,” among other poems, on Side B of folklorist William A. Owens’ recordings only because a diligent archivist included that fact in the metadata.
Not only do scholars have limited access to spoken word audio, but their ability to do new kinds of research (what Jerome McGann calls “imagining what you don’t know”) and to share these methodologies with colleagues and students is almost entirely inhibited by present modes of access. What other TCEs and important historical moments are hidden in these sound files? What if we could test hypotheses concerning the prosodic patterns of beat poets in comparison to the “high modernists” with over thirty-five thousand audio recordings in PennSound? What if we could automatically detect the difference between poetry and prose to determine when a poem is over and an author is telling us about the poem? Or determine, perhaps, whether and when the Ojibwe storytellers sound like elders from supposedly unrelated tribes? At this time, even though we have digitized hundreds of thousands of hours of culturally significant audio artifacts and have developed increasingly sophisticated systems for computational analysis of sound, there is no provision for any kind of analysis that lets one discover, for instance, how prosodic features change over time and space, how tones differ between groups of individuals and types of speech, or how one poet or storyteller’s cadence might be influenced by or reflected in another’s. Nor is there any provision for scholars interested in spoken texts such as speeches, stories, and poetry to use, or to understand how to use, high performance technologies for analyzing sound.
In August 2010, the Council on Library and Information Resources and the Library of Congress issued a report titled The State of Recorded Sound Preservation in the United States: A National Legacy at Risk in the Digital Age. This report suggests that if scholars and students do not use sound archives, our cultural heritage institutions will not preserve them. Librarians and archivists need to know what scholars and students want to do with sound artifacts in order to make these collections more accessible; likewise, scholars and students need to know what kinds of analysis are possible in an age of large, freely available collections and advanced computational analysis and visualization. To this end, the School of Information at the University of Texas at Austin and the Illinois Informatics Institute at the University of Illinois at Urbana-Champaign have received an NEH Institutes in Advanced Technologies in the Digital Humanities grant to host two rounds of an NEH Institute on High Performance Sound Technologies for Access and Scholarship (HiPSTAS). Humanists interested in sound scholarship, stewards of sound collections, and computer scientists and technologists versed in computational analytics and visualizations of sound will develop more productive tools for advancing scholarship in spoken text audio if they learn together about current practices, create new scholarship together, and jointly consider the needs, resources, and possibilities of developing a digital infrastructure for the study of sound.
HiPSTAS participants will include 20 junior and senior humanities faculty and advanced graduate students, as well as librarians and archivists from across the U.S., interested in developing and using new technologies to access and analyze spoken word recordings within audio collections. The collections we will make available for participants include poetry from PennSound at the University of Pennsylvania, folklore from the Dolph Briscoe Center for American History at UT Austin, speeches from the Lyndon B. Johnson Library and Presidential Museum in Austin, and storytelling from the Native American Projects (NAP) at the American Philosophical Society in Philadelphia. Sound archivists from UT Austin, computer scientists and technology developers from I3 at Illinois, and representatives from each of the participating collections will come together for the HiPSTAS Institute to discuss the collections, the work that researchers already do with audio cultural artifacts, and the work HiPSTAS participants can do with advanced computational analysis of sounds.
At the first four-day meeting (“A-Side”), held at the iSchool at UT from May 29 – June 1, 2013, participants will be introduced to essential issues that archivists, librarians, humanities scholars, and computer scientists and technologists face in understanding the nature of digital sound scholarship and the possibilities of building an infrastructure for enabling such scholarship. At the same meeting, they will also be introduced to advanced computational analytics such as clustering, classification, and visualization.
Participants will develop use cases for a year-long project in which they use advanced technologies to augment their research on sound. In the interim year, participants will meet virtually with the Institute Co-PIs (Clement, Auvil, and Tcheng) and report periodically on their use cases and ongoing research within the developing environment.
In the second year, the participants will return to the HiPSTAS Institute for a two-day symposium (the “B-Side” meeting) at which they will report on their year of research. At this second event, the participants will present scholarship based on these new modes of inquiry and critique the tools and approaches they have tried during the development year. This second meeting will end with a daylong session in which the group drafts recommendations for implementing HiPSTAS as an open-source, freely available suite of tools for supporting scholarship that studies and uses audio files.
Articles on HiPSTAS and HiPSTAS in the news:
Stampede for the Humanities!
By Aaron Dubrow, Texas Advanced Computing Center
Clement, T. “The Ear and the Shunting Yard: Meaning Making as Resonance in Early Information Theory.” Information & Culture 49.4 (2014): 401-426.
Clement, T. “Word. Spoken. Articulating the Voice as Descriptive Metadata for High Performance Sound Technologies for Access and Scholarship (HiPSTAS).” In Provoke. Darren Mueller, Mary Caton Lingold and Whitney Anne Trettien (eds.) Durham, NC: Duke University Press (Forthcoming).
Clement, T. “Introducing High Performance Sound Technologies for Access and Scholarship.” The International Association of Sound and Audiovisual Archives Journal (September 2013) 41: 21-28.
Clement, T. “When Texts of Study are Audio Files: Digital Tools for Sound Studies in DH” In A New Companion to Digital Humanities (Blackwell Companions to Literature and Culture). Susan Schreibman, Ray Siemens and John Unsworth (eds.) (Forthcoming).
Clement*, T., Tcheng, D., Auvil, L., and Borries, T. “High Performance Sound Technologies for Access and Scholarship (HiPSTAS) in the Digital Humanities” Proceedings of the 77th Annual ASIST Conference, Seattle, WA, 31 October – 5 November.
Clement, T. and Roy, L., “HiPSTAS: An Institute Advancing Tools for Analyzing Digital Audio Collections,” American Indian Library Association Newsletter 36 (2013): 8-15.
Filreis, A. “Anti-ordination in the visualization of the poem’s sound.” Jacket2.
Francis*, H., Clement, T., Peone, G., Carpenter, B., Suagee-Beauduy, K. “Accessing Sound at Libraries, Archives, and Museums” Indigenous Ownership & Libraries, Archives, and Museums (in review).
MacArthur, M. “Monotony, the Churches of Poetry Reading, and Sound Studies.” PMLA. Forthcoming 2015.
Mustazza, C. “The noise is the content: Toward computationally determining the provenance of poetry recordings using ARLO.” Jacket2. 10 Jan. 2015.
Perez-Hernandez, D. “Scholars Collaborate to Make Sound Recordings More Accessible” The Chronicle of Higher Education. 26 March 2014.
Rettberg, E. “Hearing the Audience.” Jacket2. 26 March 2015.
Sherwood, K. “Distanced sounding: ARLO as a tool for the analysis and visualization of versioning phenomena within poetry audio.” Jacket2. 2 March 2015.
Al Filreis, Jason Camlot, Steve Evans. BEYOND THE TEXT: Literary Archives in the 21st Century. http://amodern.net/article/beyond-text/.
Christine Mitchell, Shannon Mattern. “MEDIA ARCHAEOLOGY OF POETRY AND SOUND: A Conversation with Shannon Mattern.” http://amodern.net/article/media-archaeology-poetry-sound/.