Knowledge Injection Strategies for Large Language Models

19 Sept 2024, 10:15
15m
Hannah-Vogt-Saal

Session 3. Large Language Models 🇬🇧

Speaker

Benjamin Wolff (ZB MED - Information Centre for Life Sciences)

Description

During pre-training, a Large Language Model (LLM) learns language features such as syntax, grammar, and, to a certain degree, semantics. In this process, the model not only acquires language but also implicitly acquires knowledge; in this sense, knowledge is a byproduct of language acquisition. This characteristic is inherent in the architecture of modern LLMs. Hence, much like language features, knowledge is learned in a fuzzy manner, which leads to difficulties with highly specific concepts and a tendency to generate hallucinations.

This talk will explore techniques and strategies for sharpening fuzzy knowledge in LLMs with domain-specific information. Our focus is on lightweight methods that require no further pre-training. We will examine direct text injection strategies for LLM encoders (such as triple injection and K-BERT), the integration of additional features via multilayer perceptrons (MLPs), and the use of modular lightweight components such as knowledge-based adapters. Furthermore, we will investigate injection techniques for LLM decoders that go beyond simple prompting, such as retrieval-augmented generation (RAG), and how these can be enhanced by a multi-agent architecture; the sketches below illustrate two of these ideas.
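To make the encoder-side idea concrete, here is a minimal sketch of direct triple injection, assuming a Hugging Face BERT checkpoint; the model name, the verbalization helper, and the example triple are illustrative assumptions, not the pipeline presented in the talk.

# Minimal sketch: verbalize knowledge-graph triples and prepend them
# to the input text before encoding (the simplest form of direct
# text injection). Model and triples are illustrative assumptions.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def verbalize(triples):
    # Turn (subject, relation, object) triples into plain text.
    return " ".join(f"{s} {r} {o}." for s, r, o in triples)

def encode_with_triples(text, triples):
    # The encoder sees the verbalized domain knowledge alongside the
    # original input, sharpening its fuzzy knowledge in context.
    injected = verbalize(triples) + " " + text
    inputs = tokenizer(injected, return_tensors="pt", truncation=True)
    return model(**inputs).last_hidden_state

hidden = encode_with_triples(
    "Aspirin reduces inflammation.",
    [("aspirin", "inhibits", "cyclooxygenase-1")],
)

On the decoder side, the core of RAG is retrieving relevant passages and placing them into the prompt. The following sketch shows only that skeleton, using plain cosine similarity over precomputed embeddings; the embedding model and the generation call are deliberately left abstract.

# Minimal RAG skeleton: rank passages by cosine similarity to the
# query embedding, then build an augmented prompt for the decoder.
import numpy as np

def top_k(query_vec, doc_vecs, k=3):
    # Cosine similarity between the query and each document vector.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    return np.argsort(sims)[::-1][:k]

def build_prompt(question, passages):
    # Prepend the retrieved passages as context for the decoder.
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

In a multi-agent setup, such a retriever would be one agent among several, for example alongside a query-rewriting agent and an answer-verification agent.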

Primary author

Benjamin Wolff (ZB MED - Information Centre for Life Sciences)

Co-author

Konrad Förstner (GNOI)

Presentation materials

There are no materials yet.