Words: Anne Rahbek-Damm, DeiC
To manage a complex data flow, the SUBMERSE project employs an iterative approach to FAIRification and Data Management Plan creation, one that evolves with its growing understanding of the data and its dynamics.
The SUBMERSE project is making significant strides in the collection and management of oceanic data. By utilising submerged fibre-optic network cables, the project streams and processes enormous amounts of initially unintelligible data from the ocean floor. This data, which includes sensitive information such as ship and submarine movements, is continuously analysed, filtered, and managed in real time. Most of it is discarded, some stored temporarily, and a tiny fraction preserved in FAIR (Findable, Accessible, Interoperable, Reusable) repositories.
“This type of data collection is a great example for the need of interdisciplinary work and collaboration both on the researcher’s side, where the same data is used by multiple different domains, but also on the technical side. The project includes specialists for network, computing, data management and security”, says Hannah Mihai, Data Management Consultant at DeiC.
The Necessity of a Dynamic Data Management Plan
To handle this complex data flow, SUBMERSE requires a Data Management Plan (DMP) that evolves with the project. As data understanding improves, the DMP is updated to reflect better management practices. This process involves representatives from various organisations within SUBMERSE who collaborate to transform raw data into intelligible “data products” such as rapid earthquake detections, tsunami warnings, whale migration patterns, and non-military shipping intelligence.
Iterative FAIRification and DMP Development through Workshops
To support good data management practices, the SUBMERSE project engages in FAIRification: the process of improving the Findability, Accessibility, Interoperability, and Reusability of its digital assets, in this case the data. Because making all of the data FAIR is impractical, the focus is on gradually increasing the use of valuable data by FAIRifying meaningful subsets before they are submitted to repositories.
FAIRification and DMP creation in SUBMERSE are iterative processes involving continuous debate and redrafting based on changes in data understanding and flow. This iterative approach was highlighted in two key workshops held towards the start of the project.
The first workshop provided a platform for researchers to articulate their understanding of the instrumentation and their data needs while exploring strategic possibilities. Discussions emphasised the importance of secure data storage and the challenges of data management, particularly the scrubbing of sensitive data. Balancing data security with accessibility emerged as a key consideration.
In the second workshop, feedback was gathered to refine the DMP further. This session initiated discussions on potential uses of SUBMERSE instrumentation and the development of “SUBMERSE products” that the DMP should focus on. The collaborative effort led to the submission of the first SUBMERSE DMP towards the end of 2023, demonstrating a collective commitment to effective data management practices.
Key Findings and Insights
The refined DMP emphasises secure data storage, short-term buffering, and a scrubbing methodology that continues to evolve, filtering out sensitive information late in the data flow so that research potential is maximised. The DMP also recognises the diversity of metadata standards and file formats across the research communities involved, so that it remains inclusive and usable.
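For readers who think in code, the data flow described in this article (discard most data, buffer some briefly, scrub late, preserve a small fraction) might be sketched roughly as follows. This is purely illustrative and is not SUBMERSE's implementation: the `Segment` type, the `interest` score, the `sensitive` flag, and both thresholds are hypothetical names and values invented for the example, and "scrubbing" is simplified here to withholding flagged segments.

```python
from collections import deque
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical unit of sensor data; not SUBMERSE's actual schema."""
    segment_id: int
    interest: float   # assumed score of scientific interest, 0.0 to 1.0
    sensitive: bool   # assumed flag, e.g. vessel movement detected


def triage(segments, buffer_size=3, keep_at=0.2, preserve_at=0.8):
    """Discard low-interest data, buffer the rest briefly, scrub late.

    Returns (number discarded, ids still buffered, ids preserved).
    All thresholds are illustrative assumptions.
    """
    # Short-term buffer: once full, the oldest segment is dropped.
    buffer = deque(maxlen=buffer_size)
    discarded = 0
    for seg in segments:
        if seg.interest < keep_at:
            discarded += 1          # most data is discarded immediately
        else:
            buffer.append(seg)      # the rest is held temporarily

    # Scrubbing happens late: sensitive segments stay available in the
    # buffer and are only withheld at the point of preservation.
    preserved = [
        s.segment_id
        for s in buffer
        if s.interest >= preserve_at and not s.sensitive
    ]
    return discarded, [s.segment_id for s in buffer], preserved


if __name__ == "__main__":
    stream = [
        Segment(1, 0.10, False),
        Segment(2, 0.90, True),
        Segment(3, 0.50, False),
        Segment(4, 0.95, False),
    ]
    print(triage(stream))
```

In this toy run, segment 2 is of high interest but flagged as sensitive, so it remains in the short-term buffer without reaching the preserved set. That is the trade-off the DMP describes: the later the filtering step, the longer the data stays usable for research.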
Future Directions and Ongoing Collaboration
Looking ahead, the Data Lifecycle task will continue to foster synergies with other project efforts. Collaboration with the Ethics and Security Task and the Security Advisory Board is prioritised to ensure that researchers’ needs remain central, despite security considerations. The task will monitor scientific and technical developments to offer support and guidance on data management issues, adjusting the DMP as data transitions from unintelligible noise to valuable information.
As the project progresses, the iterative development of a robust DMP, continuously updated until the project’s end, is essential to ensure that collected data is FAIR and can be used efficiently.
Visit https://submerse.eu/
This article is featured on CONNECT47, the latest issue of the GÉANT CONNECT Magazine!