What's New
corpus

Description:
The Trendi corpus is a monitor corpus of Slovenian. It contains news articles from 106 media websites, published by 56 publishers. Trendi 2025-04 covers the period from January 2019 to April 2025, complementing the Gigafida ...
This item contains no files.
corpus

Description:
ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 1.3 contains sentences for 10 languages: Bulgarian, Danish, English, Spanish, ...
This item contains 1 file (11.08 MB).
Publicly Available



corpus

Description:
The Uganke corpus collects 2,790 Slovenian riddles from the folklore collection of the Institute of Slovenian Ethnology. The riddles come from 171 sources: fieldwork, newspapers, journals, manuscripts and printed riddle ...
This item contains 4 files (3.74 MB).
Publicly Available


Most Viewed Items
Top Last Week
corpus

Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 30 files (5.87 GB).
Publicly Available


toolService

Description:
The Q-CAT (Querying-Supported Corpus Annotation Tool) is a computational tool for manual annotation of language corpora, which also enables advanced queries on top of these annotations. The tool has been used in various ...
This item contains 1 file (1.09 MB).
Publicly Available
lexicalConceptualResource

Description:
elexiko is an online information system ("dictionary") of the contemporary German language (mainly post-World War II), which documents, explains and provides scholarly commentary on the vocabulary of the German language on the basis ...
This item contains no files.