Data Management and OSDU Team
Karin Becker
D.Sc.
Givanildo Santana do Nascimento
D.Sc. Student
Jaqueline Bitencourt Correia
Ph.D. Student
Our objectives
Digital Twins (DTs) and big data are mutually reinforcing technologies: huge volumes of data representing the physical and virtual worlds are collected, transformed, and generated through models to add value to the business. Modern DTs follow a five-component architecture, which includes a Data Management (DM) component that bridges the physical system, its mirrored virtual counterpart, and the service components. However, the functionality required of the DM component remains unclear. We analyze the DM component in terms of the big data value chain activities, highlight the key issues to be addressed (e.g., data heterogeneity, interoperability, integration, search), and propose conceptual and technological solutions to these challenges. Our goals are:
- To define the role and the core data management functionality for a DT targeted at the Oil & Gas industry;
- To propose a reference data management architecture aligned with the Data Lakehouse concept;
- To investigate appropriate technological platforms to support the DM component, and its interconnection with the other components (physical, virtual and services);
- To investigate the particular role, contribution and limitations of the OSDU (Open Subsurface Data Universe) platform and other industry standards as an enabling technology for data management in DTs.
Results and Contributions
Data Management in Digital Twins: a Systematic Literature Review
Jaqueline Bitencourt Correia and Karin Becker.
Comparing ARIMA and LSTM models to predict time series in the oil industry
Jaqueline Bitencourt Correia, Marcos Vinicius Ludwig Pivetta, Givanildo Santana do Nascimento and Karin Becker. Symposium on Knowledge Discovery, Mining and Learning.
Applying mining techniques in synthetic data for predictive maintenance: a case study
Rafael Schena, João Cesar Netto and Karin Becker. Brazilian Symposium on Databases.
Data Fusion Core in a Digital Twin for the Oil&Gas Industry
Jaqueline Bitencourt Correia, Mara Abel and Karin Becker. Brazilian Symposium on Databases.
Data Management in Digital Twins for the Oil and Gas Industry: beyond the OSDU Data Platform
Jaqueline Bitencourt Correia, Fabrício Henrique Rodrigues, Nicolau Oyhenard dos Santos, Mara Abel and Karin Becker. Journal of Information and Data Management.
What we are currently working on
Data Management Functionality and Reference Architecture
The functions of the DM component closely resemble the data and knowledge management functionality of Data Lakes and of their evolution, the Data Lakehouse. A data lake is a scalable storage and analysis system for data of any type, retained in its native format and used mainly for knowledge extraction. It should provide integration of any type of data; logical and physical organization of data; accessibility for various kinds of users; a metadata catalog to enforce data quality and lineage; governance; and scalability in terms of storage and processing. We have systematically surveyed related work and summarized how key data management issues are addressed in DTs, along with emerging trends and open challenges. We are currently defining the key DM functional components according to the big data value chain and organizing them into an open reference architecture.
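To make the metadata catalog and lineage requirements above more concrete, the following is a minimal in-memory sketch in Python. It is an illustration under stated assumptions only, not the reference architecture under definition and not part of OSDU: the class names `DatasetEntry` and `MetadataCatalog`, their methods, and the example dataset names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Hypothetical catalog record for one dataset kept in its native format."""
    name: str
    fmt: str  # native format label, e.g. "csv" or "parquet"
    tags: set[str] = field(default_factory=set)
    derived_from: list[str] = field(default_factory=list)  # direct lineage parents


class MetadataCatalog:
    """Hypothetical in-memory catalog: registration, tag search, lineage."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, name, fmt, tags=(), derived_from=()):
        # Refuse lineage links to datasets the catalog does not know about.
        for parent in derived_from:
            if parent not in self._entries:
                raise KeyError(f"unknown parent dataset: {parent}")
        entry = DatasetEntry(name, fmt, set(tags), list(derived_from))
        self._entries[name] = entry
        return entry

    def search(self, tag):
        """Return the names of all datasets carrying the given tag, sorted."""
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

    def lineage(self, name):
        """Return all ancestors of a dataset, nearest first (breadth-first)."""
        seen = []
        queue = list(self._entries[name].derived_from)
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self._entries[current].derived_from)
        return seen


# Example data is invented for illustration.
catalog = MetadataCatalog()
catalog.register("raw_sensor_readings", "csv", tags={"production", "sensor"})
catalog.register("cleaned_readings", "parquet", tags={"production"},
                 derived_from=["raw_sensor_readings"])
catalog.register("pressure_forecast", "parquet", tags={"forecast"},
                 derived_from=["cleaned_readings"])
```

In this sketch, `catalog.lineage("pressure_forecast")` walks back through the cleaned data to the raw readings, and `catalog.search("production")` returns the datasets tagged for production. A real DM component would persist this metadata and add governance and access control, which are deliberately left out here.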
OSDU Assessment
We are assessing the OSDU data platform as a means to represent data and metadata, and as a key technological component for providing core DM functionality in DTs for O&G production. Check for preliminary results here.