Forum | 2 May 2022
Cultural data and data quality – let’s talk about problems
On 2 May 2022, from 10 am to 3 pm, the Standards, Data Quality and Curation team (Task Area 2) of the NFDI4Culture consortium hosts its second public virtual forum. The event focuses on frequently occurring problems and questions around data quality in the context of research. Concrete project scenarios and applications from the fields of art, musicology, media and theatre studies serve as a starting point for an open discussion. The aim of this exchange is to raise awareness of when and where challenges around research data quality may occur and how they can be addressed.
Ensuring data quality is a core principle of good scholarly practice, with the scholarly communities ideally setting and monitoring specific quality requirements themselves. Because defining, implementing, reviewing, and improving data quality are ongoing tasks, they are particularly difficult to put into practice in research endeavors that are limited in time and budget. Drawing on concrete projects and with a view to digital work in the field of cultural studies, stakeholders will shed light on these challenges as well as on their approaches to solving them. The spectrum of challenges includes ensuring quality during data collection, the use of standardised data, the (further) development of suitable vocabularies and data models, and data mapping processes. These insights will then lead into a joint discussion of sensible quality control and of institutionalised quality management that goes beyond individual projects, including the structural framework this would require.
10.00 – 10.10 am: Welcome
10.10 – 10.40 am: On the Road to Linked Open Data – Improving the Quality of Research Data on Objects of Material Culture Using the Example of the LIDO Standard
Julia Rössel (DDK, Bildarchiv Foto Marburg) and Barbara Fichtl (SUB Göttingen)
The lecture will present the approach and some results of the BMBF-funded project "KONDA - Continuous Quality Management of Dynamic Research Data on Objects of Material Culture Using the LIDO Standard" and will further illustrate what needs to be taken into account when generating, processing and maintaining research data on objects of material culture in order to make them Linked Open Data-capable.
10.40 – 11.10 am: The Census as Linked Open Data – Some Words on a Work in Progress
Kathleen Christian and Franz Engel (HU Berlin, Census), George Bruseker and Denitsa Nenova (Athens, takin.solutions)
Since 1946, the Census of Antique Works of Art and Architecture Known in the Renaissance has systematically collected visual and written sources on the reception of antiquities in the Renaissance. As early as 1981, the data, which until then had been collected in a card index system, were digitised on the basis of a relational data model and using the computer systems available at the time. The Census is thus one of the few projects through which the history of computerisation and digitisation in the humanities can be read without a gap. The Census aims to make its semantic Linked Open Data models FAIR by providing a robust documentation of the data models, their function and how they can be reused by scholars.
11.15 – 11.45 am: Quality Aspects of Metadata in Generic Systems Using the Example of the Project "DFG-Viewer für Musikalien"
Andrea Hammes and Sebastian Meyer (SLUB Dresden)
Developed at a time of incipient mass digitisation, the DFG-Viewer now has to reconsider the specific requirements of individual media types and adapt the underlying metadata schema accordingly. Using the example of the project "DFG-Viewer für Musikalien" (ZenMEM of the University of Detmold / Paderborn and SLUB Dresden), the lecture will present and discuss this kind of specialisation work on a system that is fundamentally generic and widely used. It will consider both the coordination processes such an adaptation requires and its implicit consequences and costs: further specification in systems that have grown over time always means rework. Continuous maintenance and curation of legacy data become integral parts of planning.
11.45 – 12.05 pm: Discussion
– Lunch break –
12.50 – 1.20 pm: Of Searching and Finding – Heterogeneous Data Quality as a Challenge in the Specialised Information Service Performing Arts
Julia Beck (FID Darstellende Kunst)
Since 2015, the SIS Performing Arts has been aggregating metadata of varying quality from German-speaking libraries, archives and museums in the performing arts, recorded in a wide variety of formats and standards. In addition to gaps in authority files such as the Integrated Authority File (GND), the heterogeneity of the artefacts and the flat modelling of entities pose a particular challenge. The lecture will highlight aspects of data aggregation such as the analysis and transformation of metadata as well as the reprocessing and repurposing of authority data, all of which serve to provide researchers with data that is as structured, searchable and interlinked as possible.
1.20 – 1.50 pm: Provision, Quality and Aggregation of Correspondence Metadata
Stefan Dumont and Sascha Grabsch (Berlin-Brandenburgische Akademie der Wissenschaften)
As a material-specific information infrastructure, correspSearch has been aggregating correspondence metadata from printed and digital editions and other scholarly publications since 2014. The data is provided in a TEI-XML-based exchange format (the "CMIF") with authority file identifiers, usually by the editions, sponsoring institutions and scholars themselves. The lecture will present correspSearch's experience to date with data provision by the community and with the aggregation of that data.
1.55 – 2.25 pm: The DDB Metadata Quality Project - Foundations for Continuous Quality Management
Cosmina Berta and Francesca Schulze (Deutsche Nationalbibliothek)
The quality of metadata is crucial for the function and acceptance of an overarching service such as the German Digital Library (DDB). At its core, and therefore in most usage scenarios alike, the metadata must support a fast and reliable search of the holdings. To improve the situation in the long term, a wide variety of measures are conceivable, ranging from advisory and mediation services to analysis and reporting tools to improvements in all processing steps.
2.25 – 2.45 pm: Discussion
2.45 – 3.00 pm: Forum Communication