What's New
corpus
Description:
ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 2.0 contains subcorpora with sentences for 17 languages: Bulgarian, Danish, ...
This entry contains 1 file (14.08 MB).
Publicly Available
corpus
Description:
This entry contains the first part of the audiobook "Pramatija ali Bučman" (Pramatija, or the Bogeyman) by author Leopold Suhodolčan (COBISS ID: 264527107, ISBN: 978-961-7194-44-9).
Television spotlights were shining ...
This entry contains 9 files (81.11 MB).
Publicly Available
corpus
Description:
This entry contains the first part of the audiobook "Rdeči lev" (The red lion) by author Leopold Suhodolčan (COBISS ID: 264850179, ISBN: 978-961-7194-48-7).
Blaž was faster and soon managed to escape them, but they still ...
This entry contains 6 files (92.75 MB).
Publicly Available
Most Viewed
In the past week
corpus
Description:
ParlaMint 5.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 31 files (5.94 GB).
Publicly Available
corpus
Description:
The dataset represents the Twitter production in Slovenian in the period from 2018 until 2020. It consists of tweet IDs, retweet IDs, pseudo-anonymized user IDs, publication dates, and automatically assigned hate labels ...
This entry contains 1 file (182.04 MB).
Publicly Available
corpus
Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This entry contains 18 files (2.17 GB).
Publicly Available