What's New

 corpus 
Description:
The Trendi corpus is a monitor corpus of Slovenian. It contains news articles from 106 media websites, published by 59 publishers. Trendi 2025-12 covers the period from January 2019 to December 2025, complementing the ...
 This item contains no files.
 lexicalConceptualResource 
Description:
This digital dictionary of papermaking was made on the basis of the printed edition, i.e. Marjeta Humar (ed.) Papirniški terminološki slovar. 1996. ZRC SAZU (https://doi.org/10.3986/961618220X). It is an explanatory, ...
 This item contains 3 files (3.84 MB).
 
Publicly available; distributed under Creative Commons; attribution required.
 lexicalConceptualResource 
Description:
The dataset contains 59,598 collocation-distractor pairs for 2,856 headwords. A distractor is defined as an incorrect answer or alternative to a collocation, which can be similar to the collocation in meaning and/or form. Headwords and ...
 This item contains 1 file (1.46 MB).
 
Publicly available; distributed under Creative Commons; attribution required.

Most Viewed Items

Top Last Week
 corpus 
Description:
The Map task corpus of heritage Bosnian/Croatian/Montenegrin/Serbian (BCMS) consists of elicited conversations (map tasks) by 29 second-generation BCMS speakers originating from different regions of former Yugoslavia and ...
 This item contains 2 files (751.91 KB).
 
Publicly available; distributed under Creative Commons; attribution required, noncommercial, share alike.
 corpus 
Description:
The JuzneVesti-SR dataset consists of audio recordings and manual transcripts from the Južne Vesti website and its host show called '15 minuta' (https://www.juznevesti.com/Tagovi/Intervju-15-minuta.sr.html). The processing ...
 This item contains 7 files (4.64 GB).
 
Publicly available; distributed under Creative Commons; attribution required, share alike.