Open foundation models: reproducible science of transferable learning

19 Sept 2024, 11:50
30m
Hannah-Vogt-Saal

Session 6. Multi-Modal Foundation Models

Speaker

Dr Jenia Jitsev (Forschungszentrum Jülich)

Description

Breakthroughs in strongly transferable learning were achieved by training models with simple, generic losses on large amounts of diverse, web-scale data. Crucial to this progress was the increase in pre-training scale, that is, the model, compute, and dataset scales employed in training. Derived scaling laws suggest that generalization and transferability improve when these scales are increased hand in hand. Studying learning at such large scales is challenging: it requires datasets of sufficient size, sufficient compute resources to execute the training, and distributed training across thousands of compute nodes that must be handled without instabilities. We show how work done by the LAION community made the full pipeline for training strongly transferable multi-modal models of various kinds (openCLIP, openFlamingo), termed foundation models, fully open and reproducible. We show how key experiments for studying such models, for instance those leading to the derivation of scaling laws, critically depend on the open and reproducible nature of these pipelines, which also requires open-sourcing dataset composition and model benchmark evaluation procedures. We conclude with an outlook on studying the next generation of open multi-modal foundation models with stronger and more robust generalization, and on the datasets necessary for their creation.
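
As one concrete entry point into this open pipeline, the short sketch below loads a LAION-pretrained openCLIP model via the open_clip library and scores an image against a few text prompts (zero-shot classification). This is a minimal illustration, not material from the talk; the model tag, image path, and candidate captions are assumptions chosen for the example.

    import torch
    from PIL import Image
    import open_clip

    # Load an openCLIP model with weights pre-trained on LAION-2B data
    # (model and pretrained tags follow open_clip naming; chosen here as an example).
    model, _, preprocess = open_clip.create_model_and_transforms(
        "ViT-B-32", pretrained="laion2b_s34b_b79k"
    )
    tokenizer = open_clip.get_tokenizer("ViT-B-32")
    model.eval()

    # "example.jpg" and the candidate captions below are placeholders for illustration.
    image = preprocess(Image.open("example.jpg")).unsqueeze(0)
    text = tokenizer(["a diagram", "a dog", "a cat"])

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)
        # Cosine similarity between normalized embeddings yields zero-shot class scores.
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
        text_features = text_features / text_features.norm(dim=-1, keepdim=True)
        probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

    print("Zero-shot probabilities:", probs)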

Presentation materials

There are no materials yet.