What's New
lexicalConceptualResource
Description:
This entry consists of a TSV file containing a list of 66,347 Slovene word pairs from the Sloleks Morphological Lexicon of Slovene (v2.0; http://hdl.handle.net/11356/1230) that have been automatically identified as ...
This entry contains 1 file (2.84 MB).
Publicly Available
corpus
Description:
The Slovenian Social Assistance Rights Text Data Collection (SSAR 1.0) consists of 13 documents, including 8 legally binding texts and 5 non-legally binding texts. In total, the collection contains 6,936 sentences. The ...
This entry contains 2 files (1.62 MB).
Publicly Available
corpus
Description:
GaMS-Instruct-MED is an instruction-following dataset designed to fine-tune Slovene large language models to follow instructions in the medical domain. It consists of pairs of prompts and responses from the field of medicine, ...
This entry contains 1 file (4.58 MB).
Publicly Available
Most Viewed
In the past week
corpus
Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 30 files (5.87 GB).
Publicly Available
lexicalConceptualResource
Description:
Sloleks is the reference morphological lexicon for the Slovenian language, developed to be used in NLP applications and language manuals. Encoded in LMF XML, the lexicon contains approximately 100,000 of the most frequent Slovenian lemmas, ...
This entry contains 2 files (85.8 MB).
Publicly Available
corpus
Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This entry contains 18 files (2.17 GB).
Publicly Available