Description
Text+ is an NFDI infrastructure project that involves 30+ German institutions. Each institution hosts a rich set of research data, described by institution-specific metadata formats. A central aim of the Text+ infrastructure is to help users easily discover and access language-related research data via a central portal, the Text+ registry, where users can combine faceted and full-text search. Given the multidisciplinary background of the Text+ partners, reflected in the task areas Collections, Editions, and Lexical Resources, no single registry interface built on one metadata scheme underlying all data can be devised; the nature of the data differs across resource types. To implement facet-based search, metadata from the 30+ data providers must be mapped to the registry's facets, which until now has mostly been done by the registry team using its domain modelling environment and individual negotiations with the data providers. Experience has shown that this mapping process is not trivial. While the one-to-one negotiation has been effective, it is labour-intensive and does not help to capture and understand the commonalities that exist across the metadata of all providers. In this poster, we report on a single, common metadata scheme for Lexical Resources, which is now used by all data providers of lexical resources, so that the registry's mapping process becomes trivial for this resource type. The common metadata scheme is built upon the Component Metadata Infrastructure (CMDI). In this CMDI scheme, only a single component is used to describe lexical resources in detail; the other components were designed to be reusable, as they capture bibliographic, administrative, or technical metadata that is shared across resource types. The new scheme serves as a blueprint for the metadata providers of the other task areas, further easing the mapping burden on the registry team.