Digging into Data Winners - Digital collections

Over 85 applications were received for the international Digging into Data Challenge. The eight winners are listed below.

Structural Analysis of Large Amounts of Music Information

University of Illinois, Urbana-Champaign, University of Southampton, McGill University
SALAMI (Structural Analysis of Large Amounts of Music Information) will gather c. 23,000 hours of digitised music covering a breathtaking range of styles, regions and time periods (A Cappella to Zydeco, Appalachia to Zambia, Medieval to Post-Modern) and develop tools to tag and analyse the structures that underpin global music.

Digging Into the Enlightenment: Mapping the Republic of Letters

University of Oklahoma, University of Oxford, Stanford University
Digging into the Enlightenment: Mapping the Republic of Letters will focus on a corpus of 53,000 18th-century letters. It will extract and interpret details relating to people, places, times and subjects, and identify new ways of visualising and annotating these relationships.

Data Mining with Criminal Intent

George Mason University, University of Alberta, University of Hertfordshire
The Data Mining with Criminal Intent project will create an intellectual exemplar for the role of data mining in an important historical discipline, the history of crime, and illustrate how the tools of digital humanities can be used to wrest new knowledge from one of the largest humanities data sets currently available: the Old Bailey Online.

Towards Dynamic Variorum Editions

Mount Allison University, Imperial College London, Tufts University
Towards Dynamic Variorum Editions will develop a range of tools that allow for dynamic comparison, generation of lexica, identification of topics and extraction of quotations over 10,00 Greek and Roman texts, helping to continue the development of a fundamental resource for classical studies.

Digging into Image Data to Answer Authorship Related Questions

Michigan State University, University of Illinois, Urbana-Champaign, University of Sheffield
This project will take three specific resources (manuscripts, maps and quilts) and develop tools to analyse and identify the authorship of visual images.

Harvesting Speech Datasets for Linguistic Research on the Web

McGill University, Cornell University
This project will harvest audio and transcribed data from podcasts, news broadcasts, public and educational lectures and other sources to create a massive corpus of speech. Tools will then be developed to analyse the different uses of prosody (rhythm, stress and intonation) within spoken communication.

Railroads and the Making of Modern America–Tools for Spatio-Temporal Correlation, Analysis, and Visualization

University of Portsmouth, University of Nebraska-Lincoln
Railroads and the Making of Modern America will integrate a vast collection of textual, geographical and numerical data to allow for the visual presentation of the railroads over time, concentrating initially on the Great Plains and NE USA.

Mining a Year of Speech

University of Oxford, University of Pennsylvania
Mining a Year of Speech will create mechanisms to allow rapid and flexible access to over 9,000 hours of spoken audio files, drawn from some of the leading British and American spoken word corpora.

The projects start in 2010 and complete in March 2011.
