What's New
lexicalConceptualResource
Description:
This entry consists of a TSV file containing a list of 66,347 Slovene word pairs from the Sloleks Morphological Lexicon of Slovene (v2.0; http://hdl.handle.net/11356/1230) that have been automatically identified as ...
This item contains 1 file (2.84 MB).
Publicly Available
corpus
Description:
The Slovenian Social Assistance Rights Text Data Collection (SSAR 1.0) consists of 13 documents, including 8 legally binding texts and 5 non-legally binding texts. In total, the collection contains 6,936 sentences. The ...
This item contains 2 files (1.62 MB).
Publicly Available
corpus
Description:
GaMS-Instruct-MED is an instruction-following dataset designed to fine-tune Slovene large language models to follow instructions in the medical domain. It consists of pairs of prompts and responses from the field of medicine, ...
This item contains 1 file (4.58 MB).
Publicly Available
Most Viewed Items
Top Last Week
corpus
Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 30 files (5.87 GB).
Publicly Available
corpus
Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This item contains 18 files (2.17 GB).
Publicly Available
corpus
Description:
ParlaMint 4.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 30 files (5.67 GB).
Publicly Available