27–28 Oct 2025
Max-Planck-Institut für evolutionäre Anthropologie
Europe/Berlin timezone

Automated Research Data Management for Materials Science Simulation

27 Oct 2025, 13:00
25m
Max-Planck-Institut für evolutionäre Anthropologie

Deutscher Platz 6 04103 Leipzig

Speaker

Jan Janßen (Max-Planck-Institute for Sustainable Materials / NFDI Matwerk)

Description

As part of NFDI-Matwerk, and specifically its task area on workflows and lab environments, we aim to reduce the overhead of research data management by automatically tracking the metadata of our simulations and experiments, so that our research complies with the FAIR principles. For our community we identified three challenges: (1) managing software dependencies, (2) the heterogeneity of data formats, with every simulation code defining its own input and output format, and (3) maintaining the provenance of our research, especially when users work on both high-performance computers and their own workstations.
We addressed these challenges by (1) managing our software dependencies with the conda package manager and maintaining over 1000 materials science software packages for the conda-forge community channel, and (2, 3) developing the Python-based pyiron workflow framework [1]. The pyiron workflow framework introduces a generic format that improves the interoperability of different simulation codes and utilities, and integrates them with JupyterLab to provide a coherent user interface that simplifies access to high-performance computing resources.
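To make the idea of automatic metadata tracking concrete, the following is a minimal sketch in plain Python. It is not pyiron's API: the names `run_with_provenance` and `toy_energy` and the record layout are hypothetical, chosen only to illustrate how a wrapper might capture inputs, outputs, and the execution environment in one generic, code-independent record.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone


def run_with_provenance(func, **inputs):
    """Call func(**inputs) and return a generic record of what was run.

    Hypothetical helper for illustration; inputs must be JSON-serializable.
    """
    output = func(**inputs)
    # Hash the sorted, serialized inputs so identical calls can be recognized.
    input_hash = hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return {
        "function": func.__name__,
        "inputs": inputs,
        "input_hash": input_hash,
        "output": output,
        # Environment details that help trace where a result was produced.
        "python_version": platform.python_version(),
        "hostname": platform.node(),
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }


def toy_energy(lattice_constant, scale):
    """Stand-in for a simulation code: a parabola with its minimum at 4.05."""
    return scale * (lattice_constant - 4.05) ** 2


record = run_with_provenance(toy_energy, lattice_constant=4.0, scale=2.0)
print(json.dumps(record, indent=2))
```

Because the record has the same layout regardless of which function produced it, records from a workstation and from a high-performance computer could be stored and compared in the same way. A real workflow framework would additionally track software versions and file-based inputs and outputs.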
pyiron not only drastically improves the reproducibility of our simulation workflows, it also provides a more interactive basis for collaboration. From interactive data analysis during our meetings to programming hackathons, the transition from a command-line interface that every user customized with their own scripts to a shared environment has simplified the sharing of workflows and resulted in more efficient knowledge transfer. Finally, we are in the process of standardizing our workflow framework with other developments in the community [2], and more recently we started to integrate large language model agents into our workflows to further accelerate our research [3].
The presentation covers both the technical aspects of our research data management journey with the pyiron workflow framework in the NFDI Matwerk context and the human aspect of introducing automated research data management, highlighting the benefits for individual researchers in order to accelerate its adoption.
[1]: J. Janssen et al., Comp. Mat. Sci. 161 (2019).
[2]: J. Janssen et al., arXiv:2505.20366 (2025).
[3]: Z. Wang et al., arXiv:2507.14267 (2025).

Author

Jan Janßen (Max-Planck-Institute for Sustainable Materials / NFDI Matwerk)

Co-authors

Joerg Neugebauer (Max-Planck-Institute for Sustainable Materials)
Tilmann Hickel (Bundesanstalt für Materialforschung und -prüfung)
