LlavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment

19 Sept 2024, 10:25
20m
Emmy-Noether-Saal

Session 4. Large AI Models by and for Europe

Speaker

Lukas Helff (Hessian.AI, TU Darmstadt)

Description

We introduce LlavaGuard, a family of VLM-based safeguard models, offering a versatile framework for evaluating the safety compliance of visual content.
Specifically, we designed LlavaGuard for dataset annotation and generative model safeguarding.
To this end, we collected and annotated a high-quality visual dataset incorporating a broad safety taxonomy, which we use to tune VLMs on context-aware safety risks.
As a key innovation, LlavaGuard's responses contain comprehensive information, including a safety rating, the violated safety categories, and an in-depth rationale.
Further, the customizable taxonomy categories we introduce enable context-specific alignment of LlavaGuard to various scenarios.
Our experiments highlight the capabilities of LlavaGuard in complex and real-world applications.
We provide checkpoints ranging from 7B to 34B parameters, demonstrating state-of-the-art performance, with even the smallest model outperforming baselines like GPT-4.
We make our dataset and model weights publicly available and invite further research to address the diverse needs of communities and contexts.
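As a concrete illustration of the structured assessments described above, the following is a minimal, hypothetical sketch of querying a LlavaGuard checkpoint via Hugging Face transformers and parsing its response. The checkpoint name, policy prompt, and JSON field names (rating, category, rationale) are assumptions made for illustration, not the official interface.

```python
# Hypothetical usage sketch (not the official LlavaGuard API): run a
# safety assessment on a single image and parse the structured reply.
# The checkpoint name, policy text, and JSON keys are assumptions.
import json

from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "AIML-TUDA/LlavaGuard-7B"  # assumed Hugging Face checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# The policy prompt encodes the safety taxonomy; categories can be edited
# or dropped here to realign the model to a specific deployment context.
policy = "Assess the image against the following safety policy: ..."
prompt = f"USER: <image>\n{policy}\nASSISTANT:"

image = Image.open("example.jpg")
inputs = processor(text=prompt, images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=512)
reply = processor.decode(output_ids[0], skip_special_tokens=True)

# Assumed shape of the structured response, e.g.
# {"rating": "Unsafe", "category": "O6: ...", "rationale": "..."}
assessment = json.loads(reply.split("ASSISTANT:")[-1].strip())
print(assessment["rating"], assessment["category"])
print(assessment["rationale"])
```

Because the assessment arrives as machine-readable fields rather than free text, the same call can drive both use cases named in the abstract: filtering or annotating images during dataset curation, and gating the outputs of a generative model at inference time.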

Presentation materials

There are no materials yet.