Birkbeck University of London Knowledge Lab | Events | A Search and Mining System for Digital Humanities

A Search and Mining System for Digital Humanities

Nov 2, 2016
When: Nov 02, 2016, from 05:00 PM to 06:00 PM
Where: Birkbeck Main Building, Room 151
Attendees: Martyn Harris

Large-scale digitisation projects have left humanities researchers facing an overwhelming volume of digitised primary source material, alongside “born digital” information, relevant to their research. Current digital tools do not provide consistent support for analysing the content of digital archives that are potentially large in scale, multilingual, and held in a range of data formats. These tools are often out of reach for many research disciplines in the humanities, and can be incompatible with the way researchers locate and compare relevant sources. The Samtla (Search And Mining Tools for Language Archives) system was developed to support the exploration of digital archives by giving humanities researchers tools for search, browsing, and text mining in any domain or language, under a single system. The key to this domain-independent and language-independent digital infrastructure is a novel combination of language models and similarity measures. Comprehensive evaluation through crowd-sourcing has shown that the effectiveness of our system’s search functionality is on par with human-level performance.
