John A. Lomax and Folklore Data

This post adds technical detail to a longer post on the Sounding Out blog, in which I mention that we analyzed the recordings in the UT Folklore Center Archives at the Dolph Briscoe Center for American History, The University of Texas at Austin. The collection comprises 57 feet of tapes (reels and audiocassettes) and covers 219 hours of field recordings (483 audio files) collected by John and Alan Lomax, Américo Paredes, and Owen Wilson, among others. We wanted to find different sonic patterns, including the presence of instrumental music versus singing versus speech. The results of our analysis are noteworthy. For example, in the visualization shown in this brief movie, we see a subtle yet striking difference between the Lomax recordings (created 1926-1941), which are the oldest in the collection, and the others, created up until 1968. The Lomax recordings (primarily created by John Lomax) consistently contain the least speech in comparison to the other files.

UT Folklore Collection, Visualizing the predicted presence of Instruments, Speech, and Song using ARLO from Tanya Clement on Vimeo.

How was this data produced? We used the ARLO software to tag 4,000 randomly selected two-second windows; ARLO divided each of these windows into 1/32 windows.


We ended up with 93,966 instrumental tags, 48,718 spoken tags, and 81,890 sung tags. Counting every tagged spectrum, including those that were not instrumental, spoken, or sung, we had 25,053,489 spectra (all spectra, all 4,000 files).
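As a quick sanity check on the tag counts above, the following Python sketch computes each class's share of the labeled tags. Only the three counts, the 4,000 windows, and the two-second window length come from this post. The reading of "1/32" as 32 slices per second, and the names `TAG_COUNTS`, `class_shares`, and `ASSUMED_SLICES_PER_SECOND`, are assumptions made for illustration; this is not ARLO code.

```python
# Tag counts reported in this post.
TAG_COUNTS = {"instrumental": 93_966, "spoken": 48_718, "sung": 81_890}

# ASSUMPTION: "1/32 windows" is read here as 32 slices per second of audio.
ASSUMED_SLICES_PER_SECOND = 32
TAGGED_WINDOWS = 4_000   # from the post
WINDOW_SECONDS = 2       # from the post


def class_shares(counts):
    """Return each class's fraction of the total number of labeled tags."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_tags = sum(TAG_COUNTS.values())
shares = class_shares(TAG_COUNTS)

# Upper bound on slices inside the tagged windows, under the assumption above.
max_slices = TAGGED_WINDOWS * WINDOW_SECONDS * ASSUMED_SLICES_PER_SECOND

print(f"labeled tags: {total_tags:,} (assumed upper bound: {max_slices:,})")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

If the assumption about slice length holds, the three labeled classes account for most, but not all, of the slices inside the tagged windows, which fits the note above that some spectra were tagged as none of the three.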

The results in the movie are shown for each file, grouped according to date across the x-axis. The dates are shown at the top of the screen. The y-axis shows the number of seconds for which each class (green=instrumental; red=spoken; and purple=sung) received the highest prediction in each file. The blue bar shows the total number of seconds for each file. The movie scrolls through these results across the collection according to date.
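The per-file quantity plotted on the y-axis can be sketched roughly as follows. This is a minimal illustration, not ARLO's implementation: the input format (one dictionary of class scores per slice), the function name `seconds_per_class`, and the default slice length of 1/32 of a second are all assumptions.

```python
from collections import Counter

CLASSES = ("instrumental", "spoken", "sung")


def seconds_per_class(scores, slice_seconds=1 / 32):
    """Count, for one file, the seconds in which each class scored highest.

    `scores` is a hypothetical list with one dict per slice, mapping each
    class name to its predicted score. Ties go to the class listed first
    in CLASSES.
    """
    wins = Counter()
    for row in scores:
        best = max(CLASSES, key=lambda name: row[name])
        wins[best] += 1
    return {name: wins[name] * slice_seconds for name in CLASSES}


# Made-up scores for a two-second file: 32 slices where the instrumental
# score is highest, then 16 spoken and 16 sung.
example = (
    [{"instrumental": 0.8, "spoken": 0.1, "sung": 0.1}] * 32
    + [{"instrumental": 0.2, "spoken": 0.7, "sung": 0.1}] * 16
    + [{"instrumental": 0.1, "spoken": 0.3, "sung": 0.6}] * 16
)
per_class = seconds_per_class(example)
total_seconds = len(example) / 32  # analogous to the blue bar in the movie
print(per_class, total_seconds)
```

Sorting the files by recording date and plotting these three values next to each file's total duration would give a chart of the same general shape as the one in the movie.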

Of course, there are a number of ways you can read these results, which I’ve outlined on the longer post on the Sounding Out Blog.

Posted in Uncategorized | Comments closed

Hearing the Audience

HiPSTAS Participant Eric Rettberg has written a new piece at Jacket2 titled Hearing the Audience.

Marit MacArthur receives ACLS digital innovation fellowship

HiPSTAS participant Marit MacArthur has received an ACLS digital innovation fellowship to develop the ARLO interface for humanists interested in pitch tracking.

Distanced sounding: ARLO as a tool for the analysis and visualization of versioning phenomena within poetry audio

HiPSTAS Participant Kenneth Sherwood has written a new piece at Jacket2 titled Distanced sounding: ARLO as a tool for the analysis and visualization of versioning phenomena within poetry audio.

The Noise is the Content

HiPSTAS Participant Chris Mustazza has written a great piece at Jacket2 titled The noise is the content: Toward computationally determining the provenance of poetry recordings using ARLO.

HiPSTAS wins a second grant from NEH for HRDR

Even when digitized, unprocessed sound collections, which hold important cultural artifacts such as poetry readings, storytelling, speeches, oral histories, and other performances of the spoken word, remain largely inaccessible.

To increase access to recordings of significance to the humanities, Tanya Clement at the University of Texas School of Information, in collaboration with David Tcheng and Loretta Auvil at the Illinois Informatics Institute at the University of Illinois at Urbana-Champaign, has received $250,000 of funding from the National Endowment for the Humanities Preservation and Access Office for the HiPSTAS Research and Development with Repositories (HRDR) project. Support for the HRDR project will further the work of HiPSTAS, which is currently funded by an NEH Institute for Advanced Topics in the Digital Humanities grant to develop and evaluate a computational system that librarians and archivists can use to discover and catalog sound collections. The HRDR project will include three primary products: (1) a release of ARLO (Automated Recognition with Layered Optimization) that leverages machine learning and visualizations to augment the creation of descriptive metadata for use with a variety of repositories (such as a MySQL database, Fedora, or CONTENTdm); (2) a Drupal ARLO module for Mukurtu, an open source content management system specifically designed for use by indigenous communities worldwide; and (3) a white paper that details best practices for automatically generating descriptive metadata for spoken word digital audio collections in the humanities.

Welcome to HiPSTAS

Welcome to HiPSTAS (High Performance Sound Technologies for Access and Scholarship). We are very excited to have received funding from the National Endowment for the Humanities to host this Institute for Advanced Topics in the Digital Humanities. As part of the HiPSTAS Institute, we will host two meetings: one in May 2013 and the second in May 2014. Between the two meetings, there will be a year of virtual consultation for use cases developed by archivists, librarians, and scholars interested in developing more productive tools for advancing digital scholarship with sound collections.

This space will change as the project progresses. Please look around.

Thank you,
Tanya Clement
Assistant Professor, School of Information
University of Texas, Austin
