Tuesday, May 8, 2012

Digging into Data Challenge - what's next...

While I was reviewing the ALA's summary of the ARL meeting in Chicago, my eyes locked onto the Digging into Data Challenge. Sponsored by various humanities and social sciences organizations, this challenge offers prizes and prestige to those who develop innovative and interesting ways of gaining insight from untapped data resources. The first round was in 2009, with recipients proposing ways of analyzing literature, musical notation, letters, speeches, images and railways to understand the human condition. The recipients for the second round were announced just this past January. Projects that especially caught my attention included:

  • An Epidemiology of Information: Data Mining the 1918 Influenza Pandemic
  • Electronic Locator of Vertical Interval Successions (ELVIS): The First Large Data-Driven Research Project on Musical Style
  • Imagery Lenses for Visualizing Text Corpora
  • Digging into Connected Repositories (DiggiCORE)
  • Integrating Data Mining and Data Management Technologies for Scholarly Inquiry

One that was particularly interesting to me was (emphases added):
Digging into Metadata: Enhancing Social Science and Humanities Research
Principal Investigators: Mick Khoo, Drexel University, IMLS; Diana Massam, University of Manchester, AHRC/ESRC/JISC. Additional participating institutions: University of Glamorgan. 
Description: The project will automatically generate new forms of metadata tags from existing metadata records and associated resources that will support discovery across multiple repositories.  The project will utilize four repositories that vary in size, domain, metadata creation method and workflow, and quality.  PERTAINS, a tool developed by one of the partner schools, will be used to analyze the metadata records in each repository and then to generate Dewey Decimal Classification-based tags.  Clustering algorithms will be used to generate an index of similarity and match between resources in different repositories.  After conducting a search, the user will retrieve a list of resources from the different collections that have been tagged in similar ways. Visualization techniques will be used to display the results in ways that enhance the research process.
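
To get a feel for what matching "resources from the different collections that have been tagged in similar ways" might look like, here is a rough sketch of my own. It is not PERTAINS and not the project's clustering method: the repository names, resource ids and tag values are all made up for illustration, and I have swapped in plain Jaccard overlap between tag sets as a simple stand-in for whatever similarity index the project ends up building.

```python
from itertools import combinations

# Hypothetical records: (repository, resource id) -> set of DDC-style class numbers.
# These values are invented for illustration, not taken from the project.
RECORDS = {
    ("repo_a", "a1"): {"614.5", "940.3"},
    ("repo_a", "a2"): {"780.9"},
    ("repo_b", "b1"): {"614.5", "940.3", "070.4"},
    ("repo_b", "b2"): {"025.3"},
}


def jaccard(tags_a, tags_b):
    """Share of tags two resources have in common (0.0 when both sets are empty)."""
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0


def similarity_index(records):
    """Score every pair of resources that live in different repositories."""
    index = {}
    for (key_a, tags_a), (key_b, tags_b) in combinations(records.items(), 2):
        if key_a[0] == key_b[0]:
            continue  # only cross-repository matches are of interest here
        score = jaccard(tags_a, tags_b)
        if score > 0:
            index[(key_a, key_b)] = score
    return index


def similar_resources(records, key):
    """List resources in other repositories tagged like `key`, best match first."""
    matches = []
    for (a, b), score in similarity_index(records).items():
        if key == a:
            matches.append((b, score))
        elif key == b:
            matches.append((a, score))
    return sorted(matches, key=lambda m: m[1], reverse=True)
```

Calling similar_resources(RECORDS, ("repo_a", "a1")) on this toy data would surface the one resource in the other repository that shares tags with it. A real system would presumably work on far larger tag sets and use proper clustering rather than pairwise comparison.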
I look forward to seeing these completed projects in about two years.
