What's New
corpus
Description:
The multilingual training dataset for CAP policy topic classification, ParlaCAP-train, is a collection of parliamentary speeches in 29 European languages, automatically annotated with 21 major policy topic labels from the ...
This entry contains 1 file (19.52 MB).
Publicly Available
lexicalConceptualResource
Description:
Thesaurus of Modern Slovene is the largest automatically generated open-access collection of Slovene synonyms. The current version 2.2 contains 102,068 keywords and 362,464 synonyms. Nearly 6,000 entries also contain ...
This entry contains 1 file (9.86 MB).
Publicly Available
lexicalConceptualResource
Description:
The Comprehensive Slovenian-Hungarian dictionary is a general bilingual dictionary that is being compiled at the Centre for Language Resources and Technologies of the University of Ljubljana (CJVT UL). Version 3.0 contains ...
This entry contains 1 file (4.03 MB).
Publicly Available
Most Viewed
In the past week
corpus
Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This entry contains 18 files (2.17 GB).
Publicly Available
corpus
Description:
The CLASSLA-web 2.0 collection is a large-scale, comparable set of web corpora covering all seven South Slavic languages: Slovenian, Croatian, Bosnian, Montenegrin, Serbian, Macedonian, and Bulgarian. This second major ...
This entry contains 22 files (454.55 GB).
Publicly Available
corpus
Description:
ParlaMint 5.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 31 files (5.94 GB).
Publicly Available