iLCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
- Christian Kahmann (Leipzig University, Germany)
- Gerhard Heyer
Abstract
The iLCM project is developing an integrated research environment for the analysis of structured and unstructured data in a “Software as a Service” (SaaS) architecture. The research environment addresses requirements for the quantitative evaluation of large amounts of qualitative data with text mining methods, as well as requirements for the reproducibility of data-driven research designs in the social sciences. To this end, the iLCM research environment comprises two central components. The first is the Leipzig Corpus Miner (LCM), a decentralized SaaS application for the analysis of large amounts of news texts, developed in a previous Digital Humanities project. The second is an “Open Research Computing” (ORC) environment for executable script documents, so-called “notebooks”, which extends the text mining tools implemented in the LCM. This novel integration makes it possible to combine generic, high-performance methods for processing large amounts of unstructured text data with individual program scripts that address specific research requirements in computational social science and digital humanities.