FAIR in Action: Integrating Open Source RDM-Tools Effectively

Europe/Zurich

Alte Mensa, Wilhelmsplatz 3, Göttingen

Description

Too many disconnected tools?

Over the last few years, we have seen growing diversity across the RDM (research data management) landscape, with new tools, new actors and standardization efforts emerging. Yet integrating different tools remains a challenge. The event will primarily bring together users of RDM software, data stewards and developers from different open source RDM communities. Let's explore together how the FAIR Principles work in action!

This event is a free satellite event of FDM@Campus 2025.

The presentation materials are available in this Zenodo Community.

Key topics

Key topics that will be covered during the event:

  • Practical insights into various open source RDM tools (ELNs, generic RDM software, repositories...)
  • Strategies for using different RDM tools effectively in combination
  • Steps RDM developers are taking to integrate their solutions with other existing ones
  • Opportunities for open exchange and collaboration among users and developers of different open source RDM communities

Invited Speakers

  • Rory Macneil (RSpace): Building Vertical Interoperability: RSpace as a Bridge Between Research Tools and Infrastructure
  • Rostyslav Kuzyakiv (ETHZ): FAIR Research Data Management with ELN-LIMS openBIS 
  • David Walter and Laura Bahamón Jiménez (Max Planck Digital Library): MAUS in progress - Machine Automated Support for Software Management Plans
  • Nicolas Carpi (eLabFTW): Using the open source ELN eLabFTW as part of an RDM workflow

Formats

  • Demo sessions (e.g. to present a tool or specific functions, 45 min)
  • Talks (e.g. on use cases and best practices, 15 min)
  • Breakout sessions (interactive format on the key topics)

How to contribute?

We invite you to submit an abstract (max. 500 words) by 29.08.2025. All information on contributions can be found here.

Organizing Team

FDM@Campus 2025 is organized by the eResearch Alliance Göttingen. This satellite event is organized by IndiScale. We are open source and RDM enthusiasts and primarily focus on the development of the open source toolkit LinkAhead.

    • Registration
    • Welcome & Opening
    • Tilo Mathes (Research Space): Building Vertical Interoperability - RSpace as a Bridge Between Research Tools and Infrastructure
      • 1
        Building Vertical Interoperability: RSpace as a Bridge Between Research Tools and Infrastructure
        Speaker: Tilo Mathes (Research Space)
    • Nicolas Carpi (eLabFTW): Using the open source ELN eLabFTW as part of an RDM workflow
      • 2
        Using the open source ELN eLabFTW as part of an RDM workflow

        Electronic Lab Notebooks are most valuable when embedded in the Research Data Management (RDM) lifecycle. This talk shows how the open-source ELN eLabFTW becomes a front door to FAIR-aligned RDM: capturing structured, provenance-rich records at creation and handing them off to institutional services.

        I’ll highlight core features (templates with required fields, permissions, audit trails, versioning/locking, attachments, exports/API) and a pragmatic workflow: capture → govern → store → describe/share → automate. Implementation lessons (template design, light controlled vocabularies, onboarding, SSO) and small automations (via the API) that trigger curation or repository deposit will be shared, along with simple indicators (template adoption, key-field completeness, time-to-deposit).
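
        As a purely illustrative sketch (not part of the talk materials), the snippet below shows the kind of small API automation described above: it assumes a hypothetical eLabFTW instance and API key, lists experiments via the v2 REST API, and leaves the curation or deposit decision as a comment.

        ```python
        import requests

        ELAB_URL = "https://eln.example.org"   # hypothetical instance
        API_KEY = "your-api-key"               # created in the eLabFTW user panel

        resp = requests.get(
            f"{ELAB_URL}/api/v2/experiments",
            headers={"Authorization": API_KEY},  # eLabFTW expects the key directly
            timeout=30,
        )
        resp.raise_for_status()

        for experiment in resp.json():
            # A real automation would check template/key-field completeness here
            # and, if satisfied, hand the record off to curation or deposit.
            print(experiment["id"], experiment["title"])
        ```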

        Takeaway: eLabFTW turns routine documentation into reusable research outputs with minimal friction.

        Speaker: Nicolas Carpi (eLabFTW)
    • 3:15 PM
      Coffee Break
    • FAIR Digital Objects in all services
      • 3
        FAIR Digital Objects in all services

        Development of processes and workflows throughout the data lifecycle, guided by the FAIR principles, significantly enhances the value of data assets, accelerates research processes, broadens the data basis for AI applications and potentially leads to novel research questions across sectors. Critical services (repositories, analysis tools, and related infrastructure) must seamlessly interpret and process data across diverse software environments and domains. The FAIR Digital Objects (FDO) concept addresses this interoperability challenge by integrating essential, rich metadata with references to data assets and persistent identifiers into unified, machine-actionable packages, putting "data first".

        This presentation showcases the initial results from FDO Connect, an active project within the MISSION KI initiative - National Initiative for Artificial Intelligence and Data Economy. The FDO Connect consortium develops and deploys services that enable seamless creation, storage, and reuse of data assets as FDOs. Our presentation details the FDO architecture derived from the FDO Forum (FDOF) specifications and demonstrates its capabilities through concrete use cases. This architectural approach achieves domain-independent FDO implementation through its distinctive design. The system supports dual creation pathways: handles via the type registry and nanopublications that transcend scientific domain boundaries. These parallel implementations expose the distinctive advantages of each technology stack across domains and existing data spaces. This dual-pathway strategy directly informs guidelines, quality assessment protocols, and validation tools for the emerging FDO landscape.
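
        To make the "persistent identifier plus machine-actionable metadata" idea concrete, here is a minimal, illustrative sketch (not project code) that resolves a Handle through the public Handle.net REST proxy and prints the typed values a software agent could act on; the handle used is the system's well-known demo handle.

        ```python
        import requests

        handle = "20.1000/100"  # well-known demo handle of the Handle System
        resp = requests.get(f"https://hdl.handle.net/api/handles/{handle}", timeout=30)
        resp.raise_for_status()

        for value in resp.json()["values"]:
            # Each entry pairs a type (e.g. "URL") with machine-readable data,
            # which is what makes the identifier actionable for software, not just humans.
            print(value["type"], value["data"]["value"])
        ```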

        MISSION KI is funded by the former Federal Ministry for Digital and Transport until the end of 2025 and implemented by acatech - National Academy of Science and Engineering.

        Speaker: Sven Bingert (GWDG)
    • Transforming a Data Monolith into a Network of FAIR Digital Objects
      • 4
        Transforming a Data Monolith into a Network of FAIR Digital Objects

        FAIR scientific data provides benefits for data consumers and producers. Consumers gain access to valuable assets which they could not produce themselves, and they are enabled to integrate foreign data with minimal effort into their workflows. Producers increase the visibility and impact of their research, with data becoming recognized as a scientific contribution on par with written publications.

        FAIR Digital Objects (FDOs), which are built upon the FAIR principles, promise additional advantages including enhanced machine-actionability, standardized interfaces for automated processing, and improved provenance tracking across distributed data networks.

        However, significant challenges remain. Data producers struggle to comply with the FAIR principles and FDO specifications, in part due to the lack of standardized workflows and best-practice examples, and to the substantial expertise required for effective implementation.

        This talk presents the transformation of a monolithic data corpus into a network of FDOs. The transformation addresses two main objectives. First, the corpus is divided into data elements that are addressable at a granular scale. This allows precise linking of propositions to specific observations, accurate error reporting and versioning, and flexible recombination of dataset components. Second, controlled semantics are established across all levels of the data model. The goal is to eliminate any ambiguity for data consumers in order to increase reuse efficiency and minimize the risk of misinterpretation.

        The data was originally acquired within the RoBivaL project, which investigated different mobile robot designs in an agricultural setting. Data collection included high-resolution sensor measurements from several modalities, field logbooks containing structured experiment documentation, and specifications providing metadata and context about data structures and the equipment used.

        The corpus was first made available on Zenodo in an effort to comply with the FAIR principles. While this initial version featured a clear layout and open formats in order to facilitate reuse, it was provided as a monolith which limited its interoperability. Further, though rich semantics were explicitly documented in the initial version, they were not codified in a standardized fashion.

        The transformed version implements the corpus as a network of FDOs, using semantic web technologies for data modeling, and Nanopublications for distribution. The data model has three layers:

        1. The Experimental Research Ontology (ERO) forms the foundational layer. It models fundamental aspects of experimental data creation in general and is aligned with several upper ontologies.

        2. The RoBivaL Specification layer uses ERO to specify RoBivaL's project methodology, define the structure and semantics of the payload data, and provide information about the used equipment.

        3. The RoBivaL Payload Data layer uses the Specification layer to capture the values of experiment parameters and sensor measurements.

        The transformation was conducted as a use case of the project FDO Connect which develops tools and methodologies to bridge traditional data management practices with emerging FDO ecosystem requirements in order to facilitate the broader adoption of FAIR principles in research communities.

        The dataset transformation showcases modular multi-layered data modeling, a practical implementation of the FDO specifications, and a best practice for FAIR-compliant usage of semantic web technologies for distributed scientific data networks.
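
        As a rough illustration of the nanopublication pathway (a sketch using rdflib, with made-up RoBivaL resource names rather than the project's actual model), a single granular observation can be packaged together with its provenance and publication info as named graphs:

        ```python
        from rdflib import Dataset, Namespace, Literal
        from rdflib.namespace import RDF, XSD

        NP = Namespace("http://www.nanopub.org/nschema#")
        PROV = Namespace("http://www.w3.org/ns/prov#")
        EX = Namespace("https://example.org/robival/")   # hypothetical namespace

        ds = Dataset()
        head = ds.graph(EX["np1/Head"])
        assertion = ds.graph(EX["np1/assertion"])
        provenance = ds.graph(EX["np1/provenance"])
        pubinfo = ds.graph(EX["np1/pubinfo"])

        np_uri = EX["np1"]
        head.add((np_uri, RDF.type, NP.Nanopublication))
        head.add((np_uri, NP.hasAssertion, assertion.identifier))
        head.add((np_uri, NP.hasProvenance, provenance.identifier))
        head.add((np_uri, NP.hasPublicationInfo, pubinfo.identifier))

        # Assertion: one granular observation from a (hypothetical) payload layer.
        assertion.add((EX["measurement42"], EX["hasValue"],
                       Literal("3.7", datatype=XSD.decimal)))
        # Provenance: where the assertion came from.
        provenance.add((assertion.identifier, PROV.wasDerivedFrom, EX["fieldCampaign1"]))
        # Publication info: who minted the nanopublication.
        pubinfo.add((np_uri, PROV.wasAttributedTo, EX["someResearcher"]))

        print(ds.serialize(format="trig"))
        ```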

        Speaker: Christian Backe (DFKI Robotics Innovation Center, Bremen, Germany)
    • Breakout Sessions

      This time slot offers space to discuss data management topics you are interested in and to share experiences and best practices.

      To structure this process, we have created a HedgeDoc pad that we used to collect ideas for topics to discuss:
      General pad.

      Topic 1: How can AI help in managing research data? Balancing Innovation and Privacy: The Challenges of Using LLMs with Sensitive Data
      Pad for Topic 1

      Topic 2: What challenges do you face when integrating different tools?
      Pad for Topic 2

      Topic 3: How to build competence to use central services in decentralized research activities?
      Pad for Topic 3

      Topic 4: Examples of digital transformation that enrich the workflow, rather than a cumbersome digitization of routine processes.
      Pad for Topic 4

    • The ZMT DataPortal - Integrating LinkAhead in Institutional Data Management
      • 5
        The ZMT DataPortal - Integrating LinkAhead in Institutional Data Management

        In this talk, we introduce the web interface of the ZMT DataPortal, a cutting-edge collaboration between IndiScale GmbH and the Leibniz Centre for Tropical Marine Research (ZMT). We highlight the portal’s powerful features designed to streamline data discovery and access, and how it offers a unified, searchable database that connects datasets published by ZMT researchers, no matter where they are stored. Whether housed on ZMT's internal servers (ZMT DataCloud) or in public data repositories like PANGAEA, the DataPortal provides a comprehensive, cross-platform view of available research data. In addition, we delve into the portal’s backend, powered by LinkAhead, the open-source data management system developed by IndiScale, which ensures seamless data integration and accessibility.
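
        For a flavour of what such a unified backend query looks like, here is a minimal sketch using the LinkAhead Python client; the record type and property names are invented for illustration and do not reflect ZMT's actual data model.

        ```python
        import linkahead as db

        # Find all dataset records, regardless of where the underlying files live
        # (ZMT DataCloud or an external repository such as PANGAEA).
        datasets = db.execute_query("FIND RECORD Dataset WITH repository")

        for dataset in datasets:
            # Each record carries enough metadata to locate and cite the actual data.
            print(dataset.name, dataset.get_property("repository").value)
        ```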

        Speakers: Finn Opätz (Leibniz Centre for Tropical Marine Research (ZMT)), Helen Clara George (Leibniz Centre for Tropical Marine Research (ZMT))
    • Integrating Diverse Research Data into one Repository: The BASS metadata.xlsx Crawler
      • 6
        Integrating Diverse Research Data into one Repository: The BASS metadata.xlsx Crawler

        Initiated in 2023, the DFG Research Unit Biogeochemical Processes and Air–Sea Exchange in the Sea-Surface Microlayer (BASS) explores air–sea exchange processes through multidisciplinary field campaigns, mesocosm and laboratory experiments, and modeling. During the expected eight-year project duration (two phases of four years each), BASS encompasses nine subprojects per phase, involving 25 principal investigators, 50 employees and students, and IT specialists, a total of approximately 75 potential users. Handling and sharing diverse data from these sources require strong, compatible research data management (RDM) workflows that follow the FAIR (Findable, Accessible, Interoperable and Reusable) principles. However, sketching a suitable data model for the integration of such diverse and dynamic data in advance was out of scope. Our case study shows an open-source-based RDM system designed to help scientists without programming skills within BASS. The workflow focuses on the LinkAhead platform, extended by a custom automation tool called the metadata.xlsx crawler: this simple tool allows data providers to enrich their metadata dynamically by entering it into an Excel template spreadsheet stored alongside the corresponding data files. The metadata.xlsx crawler maps the entries to the data model or extends it automatically, if needed.

        Using a familiar, offline-capable Excel format, researchers record metadata during data collection with minimal technical barriers. Extracted metadata adheres to PANGAEA repository standards, including controlled vocabularies and provenance information, making repository submissions easier despite the current lack of direct export features from LinkAhead. This method connects simple offline metadata capture with integrated RDM workflows, supporting consistent, FAIR-compliant data archiving across various institutions and disciplines. As we prepare for BASS Phase 2, we plan to continue using and improving this approach. Future updates include saving search queries to improve metadata discoverability and enhancing metadata fields to boost findability and interoperability.
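
        The core idea can be sketched in a few lines of Python (illustrative only; the column names such as "Campaign" and "Parameter" are assumptions, and the actual BASS crawler maps the entries onto the LinkAhead data model):

        ```python
        from pathlib import Path

        import pandas as pd

        def read_metadata(data_dir: Path) -> list[dict]:
            """Read the metadata.xlsx template stored alongside the data files."""
            sheet = pd.read_excel(data_dir / "metadata.xlsx")
            records = []
            for _, row in sheet.iterrows():
                records.append({
                    "campaign": row["Campaign"],       # assumed column names
                    "parameter": row["Parameter"],
                    "unit": row["Unit"],
                    "file": str(data_dir / row["Filename"]),
                })
            return records

        # Hypothetical directory containing data files plus the Excel template.
        print(read_metadata(Path("cruise_2024/CTD"))[:3])
        ```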

        In our talk, we will present:

        • Integration of open-source tools with lightweight, offline-ready Excel templates tailored for non-programmer scientists.
        • Lessons learned from automated metadata extraction across diverse scientific teams.
        • Strategies for long-term data preservation and repository readiness.

        Speaker: Mariana Ribas-Riba (ICBM - Institut für Chemie und Biologie des Meeres)
    • LinkAhead Community Meeting (invitation only)

      In this session, we will discuss future developments in the LinkAhead cosmos.
      Please note: A separate registration was needed to attend this event. If you are not sure whether you are registered, please contact fair-in-action@indiscale.com.

    • David Walter and Laura Bahamón Jiménez (Max Planck Digital Library): MAUS in progress - Machine Automated Support for Software Management Plans
      • 7
        MAUS in progress - Machine Automated Support for Software Management Plans

        Like research data, research software plays a crucial role in the reproducibility of scientific results, and is therefore gaining growing recognition as a research output in itself.

        The development of research software (ranging from data-specific scripts to standalone software products) can be a major project that requires good planning and management. For example, the necessary infrastructure, such as software dependencies or hardware requirements, must be addressed, as well as the human resources needed to develop and maintain the software and write its documentation.

        Software Management Plans (SMPs) document all these requirements and thus improve software quality and reusability, much as Data Management Plans (DMPs) do for research data. Furthermore, the structured information in SMPs is often useful in other contexts and stages of the software project and should be reusable, for example by being shareable with other tools such as GitHub/GitLab.

        The Research Data Management Organiser (RDMO) is a well-established tool among the research community for creating DMPs and SMPs.

        In our project "Machine-AUtomated Support for Software Management Plans" (MAUS), we are developing plugins for RDMO that enhance the machine-readability and -actionability of SMPs.

        We will begin with a short introduction to SMPs and the MAUS project, then present the current state of our plugins, and conclude by giving you the option to test them yourself. We look forward to hearing your comments and ideas, and to discussing with you which further features and interfaces we should implement.
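
        Purely for illustration (this is not the MAUS plugin code and follows no particular SMP standard), "machine-actionable" here means that SMP answers end up as structured data other tools can consume, for example:

        ```python
        import json

        # A toy SMP export: answers captured in a plan serialized as JSON so that
        # another service (e.g. a GitHub/GitLab integration) could read them.
        smp = {
            "software": "my-analysis-pipeline",
            "repository": "https://gitlab.example.org/group/my-analysis-pipeline",
            "license": "MIT",
            "dependencies": ["python>=3.11", "numpy"],
            "maintenance": {"responsible": "J. Doe", "funded_until": "2026-12"},
        }

        with open("smp.json", "w", encoding="utf-8") as fh:
            json.dump(smp, fh, indent=2)
        ```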

        Speakers: David Walter (Max Planck Digital Library), Laura Bahamón Jiménez (Max Planck Digital Library)
    • Rostyslav Kuzyakiv (ETHZ): FAIR Research Data Management with ELN-LIMS openBIS
      • 8
        FAIR Research Data Management with ELN-LIMS openBIS

        Research data management (RDM) in line with the FAIR (Findable, Accessible, Interoperable and Reusable) principles is increasingly recognized as an essential component of good scientific practice. In experimental disciplines, implementing FAIR RDM is particularly challenging: every step of the research process needs to be accurately documented, data must be securely stored and backed up, and sufficient metadata must be provided to ensure long-term reusability and reproducibility. An integrated Electronic Lab Notebook (ELN) and Laboratory Information Management System (LIMS) with data management capabilities can help researchers achieve these goals.

        In close collaboration with scientists, the Scientific IT Services (SIS) of ETH Zürich have developed and operated such an integrated solution, openBIS, for more than 10 years.

        This presentation will provide an overview of the openBIS software and its major features. We will show ETH Zurich openBIS use cases demonstrating how research lab data captured and managed with openBIS can be processed in a FAIR-compliant and reproducible way.
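
        As an illustration of programmatic access (a sketch using the pyBIS client; the server URL, space and sample type are made up), ELN-LIMS entries can be pulled into downstream, reproducible processing along these lines:

        ```python
        from pybis import Openbis

        o = Openbis("https://openbis.example.org")   # hypothetical instance
        o.login("username", "password", save_token=True)

        # Fetch registered ELN entries of a given type together with their metadata.
        samples = o.get_samples(space="MY_LAB", type="EXPERIMENTAL_STEP")
        print(samples.df[["identifier", "registrationDate"]])

        o.logout()
        ```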

        Speaker: Rostyslav Kuzyakiv (ETHZ)
    • 10:30 AM
      Coffee Break
    • Automated Data Integration from Heterogeneous Sources using LinkAhead

      Many scientific projects rely on a multitude of different software systems for data storage and data exchange. Keeping data findable and accessible can be challenging, especially if data has to be shared between different sites, working groups and institutes. The open source software LinkAhead provides a powerful framework for managing complex data integrated from heterogeneous data sources.
      LinkAhead is built in such a way that data models can be extended and adapted to future requirements by the researchers at any time. Its extendable crawler framework can be adapted to automatically import data and metadata from different repositories and ELN systems, such as eLabFTW or PANGAEA. LinkAhead also provides a graphical web interface as well as an API for automated queries.
      The software can be fine-tuned to different scientific disciplines and is already used productively at multiple Helmholtz institutes, including GEOMAR, AWI and KIT.
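
      As a minimal sketch of this extensibility (using the LinkAhead Python client; type and property names are invented for illustration), a data model can be extended at runtime before the crawler inserts records against it:

      ```python
      import linkahead as db

      # Declare a new RecordType with a property and insert it into the server.
      experiment = db.RecordType(name="Experiment",
                                 description="A single measurement campaign")
      experiment.add_property(name="date", datatype=db.DATETIME)
      experiment.insert()

      # The crawler can now create records of the new type from imported files.
      run = db.Record(name="Run_001").add_parent(name="Experiment")
      run.add_property(name="date", value="2025-09-01T10:00:00")
      run.insert()
      ```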

      • 9
        Automated Data Integration from Heterogeneous Sources using LinkAhead

        Speakers: Alexander Schlemmer (IndiScale GmbH), Florian Spreckelsen (IndiScale GmbH)
    • Synthesis of the Breakout Session

      In this session, we will briefly summarize the key outcomes of the breakout session from the previous day and provide some context and outlook. Afterwards, we will take a short group photo.

    • 12:30 PM
      Lunch
    • Coscine and eLabFTW: Connecting Data and Documentation Through Metadata
      • 10
        Coscine and eLabFTW: Connecting Data and Documentation Through Metadata

        Electronic Lab Notebooks (ELNs) play a pivotal role in digitizing laboratory workflows, presenting an opportunity to structure research documentation into reusable metadata. They often facilitate connections between documentation and corresponding research data. However, many ELNs face limitations — such as restricted server space — that complicate the integration of associated files directly onto their platforms.

        This presentation focuses on ongoing collaborative efforts between ELN@RWTH, the university-wide central ELN service at RWTH Aachen University, and the DFG-funded project FOR 5599. The specific ELN in question, eLabFTW, allows for data attachments to experiments; however, its university-wide instance is constrained by a storage limit of 500 GB. Thus, the university's centralized data management system, Coscine, provides the storage for larger datasets, yet currently lacks direct integration with the ELN.

        Our objective is to support the implementation of a data management policy established by FOR 5599 that stipulates: (1) all data documentation occurs within RWTH Aachen's central eLabFTW platform, and (2) all associated data files are securely stored on Coscine. Achieving this requires robust linking mechanisms across both platforms, along with minimum metadata standards for optimal findability within Coscine. Ultimately, the documentation and data are connected through their metadata, while findability is maintained on both ends.

        To streamline this process and eliminate redundant manual entry, we have developed a Python-based solution that mirrors metadata entered in eLabFTW directly to Coscine. This approach not only aims for functionality within FOR 5599 but also seeks to address broader challenges faced by other research groups using both systems.
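
        A minimal sketch of this mirroring idea (not the group's actual tool): pull the structured extra-fields metadata of one eLabFTW experiment via its v2 REST API and hand it to a placeholder that would map it onto a Coscine metadata profile. The instance URL, experiment id and the Coscine call are assumptions.

        ```python
        import json

        import requests

        ELAB_URL = "https://eln.example.org"   # hypothetical instance
        API_KEY = "api-key"

        def fetch_experiment_metadata(experiment_id: int) -> dict:
            resp = requests.get(
                f"{ELAB_URL}/api/v2/experiments/{experiment_id}",
                headers={"Authorization": API_KEY},
                timeout=30,
            )
            resp.raise_for_status()
            raw = resp.json().get("metadata")   # eLabFTW stores extra fields as JSON
            return json.loads(raw) if raw else {}

        def push_to_coscine(resource_id: str, metadata: dict) -> None:
            """Placeholder: map the fields onto the Coscine metadata profile and
            submit them through the Coscine API or Python SDK."""
            raise NotImplementedError

        push_to_coscine("coscine-resource-id", fetch_experiment_metadata(42))
        ```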

        During this talk, we will present our proposed solution, detailing our objectives, current progress, challenges encountered along the way, and the strategies we have employed or still need to work out. We will discuss essential components required for successful integration, including eLabFTW templates and Coscine metadata profiles, and showcase relevant Python code snippets.

        The significance of our work lies in its potential to serve as a proof-of-concept for a direct and flexible integration of external storage solutions in eLabFTW, as well as facilitating integrated ELN linking within Coscine. In particular, collaborative projects spanning multiple sites and dealing with large, heterogeneous datasets highlight the need for common standards in data and metadata. We will take a closer look at this and utilize it as a basis for discussion, gathering input and feedback from the audience.

        Funding Acknowledgement
        We acknowledge funding for Deniz C. Senel through FOR 5599 (DFG project no. 511114185).

        Speaker: Nicole Parks (RWTH Aachen University)
    • Scaling to global Research Data Exchange and Management using bytEM
      • 11
        Scaling to global Research Data Exchange and Management using bytEM

        bytEM is an open, modular platform for research data management that combines decentralized data ownership with centralized standards. Research units maintain full control over their data spaces and decide how and what to publish – all within a shared framework of metadata, licensing, and workflow policies. A key focus is on controlled data exchange: bytEM enables secure and flexible collaboration between internal teams and external partners via project-specific data spaces and customizable access control. Data can be shared, published, and combined either automatically or on demand – including AI-supported or dynamically generated data products. bytEM supports both internal workflows and federated networks, allowing integration.

        Speaker: Tero Salomaa (Liberbyte GmbH)
    • Closing