Distilling Graph Structure Knowledge into Code Language Models for Generative Tasks

19 Sept 2024, 10:05
20m
Emmy-Noether-Saal

Session 4. Large AI Models by and for Europe

Speaker

Mert Tiftikci (Hessian.AI, TU Darmstadt)

Description

Although pre-trained language models (PLMs) for code keep getting significantly better, code is still largely treated as a sequence of tokens.
These models ignore the structural rules that programming languages and algorithmic concepts follow, even though such rules are easily extractable through static analysis, and this leaves significant potential for improvement untapped.
Previous work has used abstract syntax trees (ASTs) and extended variants of them, for example by extracting paths, flattening the tree, or adding graph-based auxiliary training, and these approaches have been shown to improve either performance or reliability in code generation and program synthesis.
However, most of these methods disrupt the graph modality itself, since the structure is adapted to fit transformer-based sequential models, which are currently the state of the art.
We propose a novel method that works directly with graph representations, using graph neural networks (GNNs) and infusing the learned structural information into sequential transformer models.
In doing so, the structural knowledge learned by the GNN is distilled into the PLM to help with generative tasks, where the target is a programming language and therefore no graph of the output is available, as opposed to code summarization or search tasks, where the input code and its graph are given; both steps are sketched below.
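The structural information mentioned above can be obtained cheaply with standard static-analysis tooling. As a minimal, hypothetical sketch (not the speaker's implementation), the snippet below parses Python source with the standard-library ast module and flattens the syntax tree into node labels and parent-child edges, i.e. a graph that a GNN could consume; names such as code_to_graph are illustrative only.

```python
# Minimal sketch: turn a Python AST (obtained by static analysis) into a
# (node-type, edge-list) graph representation. Illustrative only.
import ast

def code_to_graph(source: str):
    """Parse `source` and return node type labels plus parent-child edges."""
    tree = ast.parse(source)
    node_types, edges = [], []
    index = {}                      # maps each AST node to an integer node id
    for node in ast.walk(tree):
        index[node] = len(node_types)
        node_types.append(type(node).__name__)
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            edges.append((index[node], index[child]))
    return node_types, edges

nodes, edges = code_to_graph("def add(a, b):\n    return a + b")
print(nodes)   # e.g. ['Module', 'FunctionDef', 'arguments', ...]
print(edges)   # parent-child pairs such as (0, 1), (1, 2), ...
```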
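The abstract does not specify how the distillation itself is implemented; the following is a hedged sketch of one plausible setup, assuming a toy GNN that embeds the reference program's graph during training and an auxiliary loss that pulls the PLM's pooled hidden state toward that embedding. At generation time only the PLM runs, so no graph is required. All function names, shapes, and the loss weighting are assumptions.

```python
# Hedged sketch, not the presented method: distill a GNN graph embedding of
# the reference code into a sequential PLM via an auxiliary alignment loss.
import torch
import torch.nn.functional as F

def mean_neighbor_gnn(node_feats, edges, steps=2):
    """Toy GNN: average neighbor features for a few message-passing steps."""
    adj = torch.zeros(len(node_feats), len(node_feats))
    for src, dst in edges:
        adj[src, dst] = adj[dst, src] = 1.0
    adj = adj / adj.sum(dim=1, keepdim=True).clamp(min=1.0)
    h = node_feats
    for _ in range(steps):
        h = torch.tanh(adj @ h)
    return h.mean(dim=0)            # graph-level embedding

def distillation_loss(plm_hidden_states, graph_embedding, proj):
    """Pull the PLM's pooled hidden state toward the GNN graph embedding."""
    pooled = plm_hidden_states.mean(dim=0)          # pool over the token axis
    return 1.0 - F.cosine_similarity(proj(pooled), graph_embedding, dim=0)

# Hypothetical shapes: 6 AST nodes with 16-dim features, PLM hidden size 32.
node_feats = torch.randn(6, 16)
edges = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]
graph_emb = mean_neighbor_gnn(node_feats, edges)

plm_hidden = torch.randn(10, 32)                    # 10 token states from the PLM
proj = torch.nn.Linear(32, 16)                      # maps PLM space into GNN space
aux = distillation_loss(plm_hidden, graph_emb, proj)
# total_loss = generation_loss + lambda_distill * aux   (lambda_distill assumed)
print(aux.item())
```

In such a setup the auxiliary term is simply added to the usual token-level generation loss during fine-tuning, and the GNN is discarded at inference time.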

Presentation materials

There are no materials yet.