Contemporary scientific experiments produce significant amount of data as well as scientific publications based on this data. Since volumes of both are constantly increasing, it becomes more and more problematic to establish a connection between a given paper and the underlying data. However, such an association is one of the crucial pieces of information for performing various tasks, such as validating the scientific results presented in paper, comparing different approaches to deal with a problem or even simply understanding the situation in some area of science. Authors of this paper are working under the Data Knowledge Base (DKB) R&D project, initiated in 2016 to solve this issue for the ATLAS experiment at CERN. This project is aimed at developing of the software environment, providing the storage and a coherent representation of the basic information objects. In this paper authors present a metadata model developed for the ATLAS experiment, the architecture of the DKB system and its main components. Special attention is paid to the Kafka-based ETL subsystem implementation and mechanism for extraction of meta-information from the texts of ATLAS publications.
|Журнал||Journal of Physics: Conference Series|
|Состояние||Опубликовано - 18 окт 2018|
|Опубликовано для внешнего пользования||Да|
|Событие||18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research, ACAT 2017 - Seattle, Соединенные Штаты Америки|
Продолжительность: 21 авг 2017 → 25 авг 2017
ASJC Scopus subject areas
- Physics and Astronomy(all)