What's New
toolService

Description:
SloNER is a model for Slovenian Named Entity Recognition. It is a PyTorch neural network model, intended for use with the HuggingFace transformers library (https://github.com/huggingface/transformers).
The model ...
This item contains 1 file (387.44 MB).
Publicly Available



corpus

Description:
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, ...
This item contains 2 files (128.43 MB).
Publicly Available


toolService

Description:
Text summarisation aims to convert a longer text into a shorter one while preserving the essential information of the source text. In general, there are two approaches to text summarisation. The extractive approach ...
This item contains 5 files (4.82 GB).
Publicly Available



Most Viewed Items
Top Last Week
corpus

Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This item contains 18 files (2.17 GB).
Publicly Available


corpus

Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This item contains 18 files (23.37 GB).
Publicly Available


corpus

Description:
The SUK training corpus contains about 1 million tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation, with some parts also containing further manually ...
This item contains 2 files (43.14 MB).
Publicly Available


