Towards a Reliable Web of Knowledge

19 Sept 2024, 10:00
15m
Hannah-Vogt-Saal

Session 3. Large Language Models 🇬🇧

Speaker

Roman Matzutt (Fraunhofer FIT)

Description

The advent of Large Language Models (LLMs), most notably ChatGPT, has fascinated researchers and the public alike. The main attraction of LLMs is their capability to interpret prompts formulated in natural language and to respond accordingly, allowing for more organic interactions with LLM-based AI systems and increasing their accessibility, especially for less tech-savvy users. LLMs gain these capabilities by being trained on a huge corpus of texts and, in the process, learning patterns of knowledge encoded in this corpus.

However, we now know that the truth is more complex. ChatGPT has been shown to make mistakes and to hallucinate, producing responses that are statistically likely but do not conform to actual knowledge. While arguably harmless in settings motivated by curiosity and learning about the technology and its capabilities, such mistakes are prone to causing real-world damage if they occur in professional settings, become harder to identify, and actively influence our communication and decision-making processes.
Fundamentally, we argue that LLMs should be increasingly seen as what their name implies: AI models for understanding and generating human languages, which provide intuitive human-computer interfaces. Conversely, at least for critical applications, they should not be seen as knowledge models.

This perspective is motivated by the recent shift toward deploying special-purpose LLMs and fine-tuning existing LLMs for singular applications. This approach is sensible from the perspective of the application provider: they can leverage the language-processing capabilities of advanced LLMs, such as ChatGPT, and provide application-specific knowledge in the form of additional training data.

However, this development breaks with the global view of a universally connected Internet. In a sense, it is even antithetical to the Internet's evolution: in the era before search engines, users had to know the relevant websites themselves; the advent of search engines provided them with a central hub for finding relevant information. Given that LLMs are at least partly used to obtain information, the shift toward a vast landscape of application-specific LLMs risks partly reverting this convenience, as users would again have to know which LLM serves their purpose best.

We argue that, from an end-user perspective, we should strive to establish an LLM-based Web of Knowledge (WoK), which uses an LLM with versatile interpretation capabilities as a unified interface for querying relevant knowledge pools. In contrast to implicit knowledge learning or relying on a federation of specialized LLMs, an LLM-based WoK would analyze users' prompts for relevant knowledge repositories, retrieve the information from there in a structured manner, and only then formulate an answer in natural language.
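To make the proposed query flow concrete, the following is a minimal sketch of the three stages described above: routing a prompt to matching knowledge repositories, retrieving structured facts, and phrasing an answer. All names (`KnowledgeRepository`, `route_prompt`, `answer`) are illustrative assumptions, not an existing API; a real WoK would use an LLM for the interpretation and phrasing steps rather than keyword matching and string templates.

```python
# Hypothetical sketch of an LLM-based Web of Knowledge (WoK) query flow.
# Keyword routing and string templating stand in for the LLM components.
import re
from dataclasses import dataclass

@dataclass
class KnowledgeRepository:
    name: str
    topics: set    # topics this repository claims to cover
    facts: dict    # structured knowledge: topic -> fact

def route_prompt(prompt: str, repositories: list) -> list:
    """Stage 1: select repositories whose topics appear in the prompt.
    (In a WoK, an LLM would interpret the prompt instead.)"""
    words = set(re.findall(r"\w+", prompt.lower()))
    return [r for r in repositories if r.topics & words]

def answer(prompt: str, repositories: list) -> str:
    """Stages 2 and 3: retrieve structured facts from the selected
    repositories, then phrase them as a response with source attribution."""
    words = set(re.findall(r"\w+", prompt.lower()))
    facts = []
    for repo in route_prompt(prompt, repositories):
        for topic in sorted(repo.topics & words):
            if topic in repo.facts:
                facts.append(f"{topic}: {repo.facts[topic]} (source: {repo.name})")
    return "; ".join(facts) if facts else "No matching knowledge repository found."

weather = KnowledgeRepository("weather-db", {"rain"}, {"rain": "expected tomorrow"})
print(answer("Will it rain?", [weather]))
# → rain: expected tomorrow (source: weather-db)
```

The key design point of the sketch is that knowledge is retrieved from attributable repositories at query time rather than being encoded implicitly in model weights, which is what distinguishes the WoK proposal from a federation of fine-tuned LLMs.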

On the flip side, the idea of a WoK introduces new challenges that require additional attention, such as well-defined APIs for information retrieval or the combination of knowledge obtained from different sources. In this presentation, we outline the concept of the WoK and point out the further research efforts required to steer toward it.

Primary authors

Roman Matzutt (Fraunhofer FIT)
Dr Avikarsha Mandal (Fraunhofer FIT)