What's New
corpus

Description:
The Trendi corpus is a monitor corpus of Slovenian. It contains news articles from 106 media websites, published by 70 publishers. Trendi 2023-11 covers the period from January 2019 to November 2023, complementing the ...
This entry contains no files.
corpus

Description:
The Ukrainian parliamentary corpus ParlaMint-UA 4.0.1 is an extended version of the ParlaMint-UA 4.0 corpus (available as a collection of plain texts along with TSV metadata of the speeches http://hdl.handle.net/11356/1859 ...
This entry contains 4 files (3.84 GB).
Publicly Available


corpus

Description:
Šolar-Eval is a specialized dataset designed for the evaluation of Slovene spell- and grammar-checking tools and methodologies. It encompasses 109 essays authored by Slovene primary and secondary school students, featuring ...
This entry contains 4 files (12.54 MB).
Publicly Available




Most Viewed
In the past week
corpus

Description:
ParlaMint 4.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 30 files (5.67 GB).
Publicly Available


toolService

Description:
The monolingual Slovene RoBERTa (A Robustly Optimized Bidirectional Encoder Representations from Transformers) model is a state-of-the-art model representing words/tokens as contextually dependent word embeddings, used for ...
This entry contains 2 files (1.29 GB).
Publicly Available



toolService

Description:
Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. A state-of-the-art tool representing words/tokens as contextually dependent word embeddings, ...
This entry contains 3 files (476.35 MB).
Publicly Available

