Speaker
Georgios Kaklamanos
(GWDG)
Description
In recent months, discussion has intensified about advances in AI technologies and their potential effects on society, particularly the negative ones.
In this talk, we will start with an overview of the current field of AI Safety and present its most prominent research agendas. We will then move on to interpretability research, focusing specifically on "Discovering Latent Knowledge" and "Concept Mapping" in large language models (LLMs).
Primary author
Georgios Kaklamanos
(GWDG)