HiPSTAS is at MLA 2016 in Austin!
Thursday, 7 January
136. Close and Distant Listening to Poetry with HiPSTAS and PennSound
Program arranged by the Forum TM Libraries and Research
Presiding: Tanya E. Clement, Univ. of Texas, Austin
There are hundreds of thousands of hours of important spoken-word audio recordings, dating from the nineteenth century to the present day. These artifacts, many of which are poetry readings by significant literary figures, are only marginally accessible for listening and almost completely inaccessible for new forms of analysis and instruction in the digital age. Further, in August 2010, the Council on Library and Information Resources and the Library of Congress issued a report titled The State of Recorded Sound Preservation in the United States: A National Legacy at Risk in the Digital Age, which suggests that if scholars and students do not use sound archives, our cultural heritage institutions will be less inclined to preserve them. Librarians and archivists need to know what scholars and students want to do with sound artifacts in order to make these collections more accessible, but humanities scholars, arguably, also need to know what kinds of analysis are possible in an age of large, freely available digital collections and advanced computational analysis.
To be sure, computer performance, in terms of speed and storage capacity, has increased to the point where it is now possible to analyze large audio collections with high-performance systems, but scholars’ abilities to do new kinds of research (what Jerome McGann calls “imagining what you don’t know”) and to share and teach these methodologies with colleagues and students are almost entirely inhibited by present modes of access. This panel addresses these issues through an introduction to the HiPSTAS (High Performance Sound Technologies for Access and Scholarship) Project. Funded by the National Endowment for the Humanities, HiPSTAS is a collaboration among the iSchool at the University of Texas, Austin, the Illinois Informatics Institute (I3) at the University of Illinois at Urbana-Champaign, and scholars, librarians, and archivists to develop new technologies that facilitate access to and analysis of spoken-word recordings.
Specifically, this panel will address what it means to “close-listen” (Bernstein 2011) and “distant-listen” (Clement 2012) to digital recordings of poetry performances.
Charles Bernstein, co-director of PennSound (the largest internet archive of poetry readings, in terms of both content and audience), closely identifies literary scholarly inquiry into sound, or “close listening,” with increased access, claiming that with such access, “the sound file would become . . . a text for study, much like the visual document. The acoustic experience of listening to the poem would begin to compete with the visual experience of reading the poem” (Bernstein 114). This fifteen-minute introduction will be the first MLA presentation on how PennSound’s modes of access, though its recordings are freely available as downloads, are shaped not only by editorial criteria and approaches to copyright but also by modes of funding, technical features, how the site is used by listeners both in the U.S. and globally, and the site’s relationship to institutional affiliations such as the University of Pennsylvania Libraries and the Electronic Poetry Center.
This panel will also comprise the first presentations from the HiPSTAS project. The HiPSTAS team is developing ARLO (Adaptive Recognition with Layered Optimization) as a tool for “distant listening,” or “investigat[ing] significant patterns within the context of a system that can translate ‘noise’ (or seemingly unintelligible information) into patterns” (Clement 2012) for interpretation. As the remaining panelists will show in three fifteen-minute presentations, these patterns of interest include audience sounds, material sounds that resound from recording technologies, and performance sounds that help us distinguish versions of poems from remixes.
Steve McLaughlin will consider audience feedback as a distinctive feature of public poetry performance that is widely overlooked. Applause, a convention so common as to be nearly invisible, indexes the presence of an audience while conveying a general sense of its size, disposition, and perhaps the success of a given reading. Fortunately, the sonic properties of applause make it well-suited for identification through machine learning. Using measurements produced by the ARLO audio analysis tool, this presentation will tease out applause patterns in poetry recordings from the PennSound archive, with reference to region, venue, time period, and other factors.
The provenance of recordings, which can provide important clues to social, economic, and production histories, is another feature that is often lost in transcription. The question remains whether material provenance can be recovered from vestigial artifacts encoded in recordings as “para-sound watermarks.” In his talk, Chris Mustazza will consider whether audio analysis tools can help uncover material signatures in early poetry recordings originally made on aluminum records, including some by Vachel Lindsay, Gertrude Stein, and James Weldon Johnson, and will attempt to locate other recordings of common provenance in the PennSound archive. Additional topics will include the ontological implications of audio transcodings and the connection of materiality to the conditions of (re)publication.
Kenneth Sherwood explores the opportunities for interpretation opened up by the fact that audio poetry archives provide scholars unprecedented access to multiple recordings of a given poem. Close listening and ethnopoetic transcription provide a methodology for identifying and describing significant paralinguistic variations but are inadequate to the scale of archives like PennSound. Using ARLO as a visualization tool, it becomes feasible to work at the scale of the archive and to address questions of broader scope, such as: Do readings tend to increase or decrease in pace over time? Do they become more or less dynamic? Do the answers conform to or challenge dominant notions of poetic school, style, audience, setting, region, and so on? To the extent that we find such questions worth pursuing, computational analysis and visualization tools may help us frame the answers.
This panel will demonstrate that infrastructures (both social and technological) that facilitate access to sound recordings have a direct impact on how we understand and teach sound cultures.
Bernstein, Charles. Attack of the Difficult Poems: Essays and Inventions. University of Chicago Press, 2011. Print.
Clement, Tanya E. “Distant Listening: On Data Visualisations and Noise in the Digital Humanities.” Text Tools for the Arts. Digital Studies / Le champ numérique. 3.2 (2012). Web. 4 April 2015.
This post adds technical detail to a longer post of mine on the Sounding Out blog, in which I mention that we analyzed the recordings in the UT Folklore Center Archives at the Dolph Briscoe Center for American History, The University of Texas at Austin. The collection comprises 57 feet of tapes (reels and audiocassettes) and covers 219 hours of field recordings (483 audio files) collected by John and Alan Lomax, Américo Paredes, and Owen Wilson, among others. We wanted to find different sonic patterns, including the presence of instrumental music versus singing versus speech. The results of our analysis are noteworthy. For example, in the visualization shown in this brief movie, we see a subtle yet striking difference between the Lomax recordings (created 1926–1941), which are the oldest in the collection, and the others, created up until 1968. The Lomax recordings (primarily created by John Lomax) consistently contain the least speech in comparison to the other files.
How was this data produced? We used the ARLO software. We tagged 4,000 randomly selected two-second windows, which ARLO divided into 1/32-second sub-windows (spectra).
We ended up with 93,966 instrument tags, 48,718 spoken tags, and 81,890 sung tags. Counting all spectra, including those tagged as none of these classes, we had 25,053,489 spectra across all 4,000 files.
The results in the movie are shown for each file, grouped by date across the x-axis; the dates are shown at the top of the screen. The y-axis shows the number of seconds for which each class (green = instrumental; red = spoken; purple = sung) was predicted highest for each file, and the blue bar shows the total number of seconds in each file. The movie scrolls through these results across the collection by date.
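The per-file totals behind this visualization can be sketched in a few lines of code. The sketch below is hypothetical (it does not use ARLO's actual export format, which is not shown here); it assumes each file yields a list of per-slice class scores at the 1/32-second resolution described above, takes the highest-scoring class per slice, and converts slice counts to seconds.

```python
from collections import Counter

SLICE_SEC = 1.0 / 32  # assumed duration of one spectral slice

def class_seconds(predictions):
    """Given per-slice class scores, e.g. {'instrumental': 0.2, 'spoken': 0.7,
    'sung': 0.1}, return the number of seconds for which each class scored
    highest -- the quantity plotted on the y-axis for each file."""
    counts = Counter(max(p, key=p.get) for p in predictions)
    return {cls: n * SLICE_SEC for cls, n in counts.items()}

# Toy example: 64 slices (two seconds), mostly speech with some instrument
preds = ([{'instrumental': 0.1, 'spoken': 0.8, 'sung': 0.1}] * 48
         + [{'instrumental': 0.7, 'spoken': 0.2, 'sung': 0.1}] * 16)
totals = class_seconds(preds)
print(totals)  # {'spoken': 1.5, 'instrumental': 0.5}
```

Summing these per-file totals and grouping them by recording date reproduces the bars shown in the movie.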
Of course, there are a number of ways you can read these results, which I’ve outlined on the longer post on the Sounding Out Blog.
HiPSTAS Participant Eric Rettberg has written a new piece at Jacket2 titled Hearing the Audience.
HiPSTAS participant Marit MacArthur has received an ACLS digital innovation fellowship to develop the ARLO interface for humanists interested in pitch tracking.
HiPSTAS participant Kenneth Sherwood has written a new piece at Jacket2 titled Distanced sounding: ARLO as a tool for the analysis and visualization of versioning phenomena within poetry audio.
HiPSTAS Participant Chris Mustazza has written a great piece at Jacket2 titled The noise is the content: Toward computationally determining the provenance of poetry recordings using ARLO.
Even when digitized, unprocessed sound collections, which hold important cultural artifacts such as poetry readings, storytelling, speeches, oral histories, and other performances of the spoken word, remain largely inaccessible.
In order to increase access to recordings of significance to the humanities, Tanya Clement at the University of Texas School of Information, in collaboration with David Tcheng and Loretta Auvil at the Illinois Informatics Institute at the University of Illinois, Urbana-Champaign, has received $250,000 in funding from the National Endowment for the Humanities Preservation and Access Office for the HiPSTAS Research and Development with Repositories (HRDR) project. Support for the HRDR project will further the work of HiPSTAS, which is currently funded by an NEH Institute for Advanced Topics in the Digital Humanities grant to develop and evaluate a computational system that helps librarians and archivists discover and catalog sound collections. The HRDR project will include three primary products: (1) a release of ARLO (Automated Recognition with Layered Optimization) that leverages machine learning and visualizations to augment the creation of descriptive metadata for use with a variety of repositories (such as a MySQL database, Fedora, or CONTENTdm); (2) a Drupal ARLO module for Mukurtu, an open-source content management system specifically designed for use by indigenous communities worldwide; and (3) a white paper that details best practices for automatically generating descriptive metadata for spoken-word digital audio collections in the humanities.
Welcome to HiPSTAS (High Performance Sound Technologies for Access and Scholarship). We are very excited to have received funding from the National Endowment for the Humanities to host this Institute for Advanced Topics in the Digital Humanities. As part of the HiPSTAS Institute, we will host two meetings: one in May 2013 and the second in May 2014. Between the two meetings, there will be a year of virtual consultation for use cases developed by archivists, librarians, and scholars interested in developing more productive tools for advancing digital scholarship with sound collections.
This space will change as the project progresses. Please look around.
Assistant Professor, School of Information
University of Texas, Austin