What's New
corpus
Description:
The ROG-Dialog corpus of spoken Slovenian consists of volunteered audio recorded by students, who asked their relatives or acquaintances to talk on record in their homes. The speakers were directed to use various styles of ...
This entry contains 2 files (1.22 GB).
Publicly Available
corpus
Description:
The Trendi corpus is a monitor corpus of Slovenian. It contains news articles from 106 media websites, published by 59 publishers. Trendi 2025-11 covers the period from January 2019 to November 2025, complementing the ...
This entry contains no files.
corpus
Description:
The KROHOT corpus consists of 10 audio recordings of private, spontaneous conversations between two or three speakers, with a total duration of 232 minutes. Most recordings were made between May and September 2025. The ...
This entry contains 2 files (878.84 MB).
Publicly Available
Most Viewed
In the past week
corpus
Description:
The JuzneVesti-SR dataset consists of audio recordings and manual transcripts from the Južne Vesti website and its host show called '15 minuta' (https://www.juznevesti.com/Tagovi/Intervju-15-minuta.sr.html). The processing ...
This entry contains 7 files (4.64 GB).
Publicly Available
corpus
Description:
ParlaMint 5.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 31 files (5.94 GB).
Publicly Available
lexicalConceptualResource
Description:
ONTEM 1.0 comprises 1,019 manually prepared entries, each consisting of information about the lemma, part-of-speech (following the MULTEXT-East tagset for Slovenian, https://nl.ijs.si/ME/V6/msd/html/msd-sl.html), CEFR ...
This entry contains 2 files (60.29 KB).
Publicly Available