What's New
lexicalConceptualResource
Description:
This entry consists of a TSV file containing a list of 66,347 Slovene word pairs from the Sloleks Morphological Lexicon of Slovene (v2.0; http://hdl.handle.net/11356/1230) that have been automatically identified as ...
This item contains 1 file (2.84 MB).
Publicly Available
corpus
Description:
The Slovenian Social Assistance Rights Text Data Collection (SSAR 1.0) consists of 13 documents, including 8 legally binding texts and 5 non-legally binding texts. In total, the collection contains 6,936 sentences. The ...
This item contains 2 files (1.62 MB).
Publicly Available
corpus
Description:
GaMS-Instruct-MED is an instruction-following dataset designed to fine-tune Slovene large language models to follow instructions in the medical domain. It consists of pairs of prompts and responses from the field of medicine, ...
This item contains 1 file (4.58 MB).
Publicly Available
Most Viewed Items
Top Last Week
corpus
Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 30 files (5.87 GB).
Publicly Available
corpus
Description:
ParlaMint 4.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 30 files (5.67 GB).
Publicly Available
corpus
Description:
ParlaMint 3.0 is a multilingual set of 26 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2022, with the individual corpora being between 9 and 125 million words in size. The ...
This item contains 27 files (5.22 GB).
Publicly Available