Life Science/Bioinformatics Workshop

Europe/Berlin
https://meet.gwdg.de/b/mar-bwn-zzi-ap8 (Online)

Martin Leandro Paleico (GWDG)
Description

All science domains are undergoing a computerized revolution, and life science is no exception. Researchers must not only master their own field but also understand and work with new technologies, which in turn expand their research capabilities. This is where the GWDG offers its expertise and resources: from programming to high-performance computing services, from license handling to service hosting, we strive to help our users and facilitate their day-to-day work.

The aim of this workshop is to present our currently available services in the domain of life sciences/bioinformatics and, at the same time, to receive feedback from our users regarding the future development of this science domain at the GWDG. This workshop will help us create the roadmap for the coming years of bioinformatics at the GWDG. If you want to influence the evolution of our life science offerings, this is your chance!

Attendees are encouraged to contribute a 5-10 minute presentation of 2-5 slides. The presentation should focus on their current research and on their challenges and needs with regard to technical support and high-performance computing (this can be in the form of services, training, community, etc.). We hope that researchers at every stage of their career will attend and participate, so that we can better gauge the requirements of the scientific community.

If you want to contribute a presentation, please sign up and then send an email to Martin Paleico with the subject "Bioinfo Workshop 2022 Presentation" and the title and a short description of your presentation in the body. You will receive a confirmation and a time slot.

Additionally, we will introduce the bioinformatics-related services at the GWDG and give showcase presentations for some of them.

This event is mainly aimed at members of the University of Göttingen, the University Medical Center Göttingen (UMG) and the various Max Planck Institutes (MPI), but researchers and students from other institutions are also welcome to attend.

 

Note: Please expect changes to the agenda as new presentations are added or changed.

    • 14:00 14:15
      Welcome and Motivation 15m

      Introduction and motivation/goals for this workshop

      Speakers: Julian Kunkel, Martin Leandro Paleico (GWDG)
    • 14:15 14:30
      Current Bioinformatics Services 15m

      Short presentation of the bioinformatics services currently offered at the GWDG

      Speaker: Martin Leandro Paleico (GWDG)
    • 14:30 16:15
      Presentations: GWDG and User Presentations

      Users will present their use cases for bioinformatic tools, with an eye towards past, present and future challenges, and how the GWDG could help them with their research.

      Convener: Martin Leandro Paleico (GWDG)
      • 14:30
        Stefanie Mühlhausen/Pavan Siligam (GWDG): Alphafold on the SCC: An application of software containers in HPC 15m

        AlphaFold is a recent and popular tool for protein structure prediction. Because of its many dependencies, it is a rather complex tool to install, so it benefits from the consistent and repeatable deployment made possible by container technology. We present an example from our own SCC.

      • 14:45
        Andreas Leha: RShiny applications support semi-automatic workflows 15m

        This is a project of the "Scientific Core Facility Medical Biometry and Statistical Bioinformatics", in which we designed a Shiny app that our users could then use to fine-tune our analyses for their data.

      • 15:00
        Hendrik Nolte (GWDG): SecureHPC: A Secure Partition to Process Sensitive Data on a Shared HPC System 15m

        Privacy and security concerns play an increasingly important role when dealing with medical data and other types of sensitive data common in the life sciences. How can safety be ensured when data must also be processed off-site? An example from our own HPC services.

      • 15:15
        Martin Schulte-Rüther: R-based machine learning models as an autism detection tool 15m

        R-based machine learning models were trained on the GWDG's HPC services. These models aim to improve a diagnostic tool for autism. RShiny was used to showcase the models in a web app (https://msrlab.shinyapps.io/asd-ml-jcpp/). The web app saves data to a PostgreSQL server also provided by the GWDG.

      • 15:30
        Carsten Fortmann-Grote: RAREFAN: A publicly accessible bioinformatic pipeline on a cloud server - Implementation and limitations 15m

        In this presentation I will briefly describe the functionality and modular implementation of our RAREFAN web service, an online tool to find and analyse REPINs and RAYTs [1]. RAREFAN runs on top of the Python Flask framework and uses the Redis queuing system to schedule jobs and manage task dependencies on a GWDG-hosted cloud server. It drives a bioinformatic pipeline consisting of various self-developed and third-party programs. Results are communicated to the user in an R Shiny app. I will also touch on some performance issues that could be addressed by making HPC resources accessible from the cloud server.

        Reference:
        [1] RAREFAN: a webservice to identify REPINs and RAYTs in bacterial genomes
        Carsten Fortmann-Grote, Julia Balk, Frederic Bertels
        doi: https://doi.org/10.1101/2022.05.22.493013

      • 15:45
        Jonaid Hossain: The DANCE Framework 15m

        DANCE (https://omicsml.ai/) is a PyTorch-based deep learning toolbox designed for single cell analysis. The goal of this framework is to help develop personalized deep learning models for analyzing single-cell data at scale.

      • 15:45
        RShiny at the GWDG: Showcase (If time is available/replacement) 15m
        Speaker: Martin Leandro Paleico (GWDG)
      • 16:00
        Manolis Bastakis: Employing high-performance computing services from GWDG to discover new gene regulatory networks in fungi 15m

        It has been estimated that Earth hosts approximately 1.5 million species of fungi. These organisms show great variety in their forms, their living environments, and their interactions with other organisms. The impact of fungi on our health, industry, and economy has been enormous for centuries and becomes more prominent year by year. If we want to harness the power of these fantastic organisms, we therefore need to understand their basic biology as deeply as possible. Toward that goal, over the last few years I have been combining next-generation sequencing techniques with other experimental approaches to understand the basic developmental biology of certain fungi, such as the opportunistic human-pathogenic fungus Aspergillus fumigatus, the plant pathogen Verticillium dahliae, and many more. As part of my pipeline, I use high-performance computing services from the GWDG to elucidate previously unknown gene networks that govern fungal development.

    • 16:15 16:30
      Break 15m
    • 16:30 17:00
      Discussion 30m

      Discussion of the user presentations

      Speaker: Martin Leandro Paleico (GWDG)
    • 17:00 17:30
      Conclusions 30m

      Collaborative elaboration of a roadmap and goals list for bioinformatics at the GWDG in the coming year.

      Speaker: Martin Leandro Paleico (GWDG)