What's New
corpus

Description:
The dataset was created from a large number of Serbian legislation texts gathered from the https://www.pravno-informacioni-sistem.rs/ website. The gathered texts were used for fine-tuning a neural network called SRBerta ...
This entry contains 5 files (66.42 MB).
Publicly Available


toolService

Description:
SloNER is a model for Slovenian Named Entity Recognition. It is a PyTorch neural network model intended for use with the HuggingFace transformers library (https://github.com/huggingface/transformers).
The model ...
This entry contains 1 file (387.44 MB).
Publicly Available



corpus

Description:
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, ...
This entry contains 2 files (128.43 MB).
Publicly Available


Most Viewed
In the past week
corpus

Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This entry contains 18 files (2.17 GB).
Publicly Available


corpus

Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This entry contains 18 files (23.37 GB).
Publicly Available


corpus

Description:
The ParlaSpeech-HR dataset is built from parliamentary proceedings available in the Croatian part of the ParlaMint corpus and the parliamentary recordings available from the Croatian Parliament's YouTube channel. The corpus ...
This entry contains 5 files (117.25 GB).
Publicly Available


