Latest

toolService
Description:
The model for lemmatisation of non-standard Serbian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the SETimes.SR training corpus (http://hdl.handle.net/11356/1200), ...
This entry contains 1 file (850.72 MB).
Publicly Available. Distributed under Creative Commons (Attribution Required, Share Alike).
toolService
Description:
The model for lemmatisation of non-standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1210), ...
This entry contains 1 file (789.52 MB).
Publicly Available. Distributed under Creative Commons (Attribution Required, Share Alike).
toolService
Description:
This model for morphosyntactic annotation of non-standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handl ...
This entry contains 2 files (1.12 GB).
Publicly Available. Distributed under Creative Commons (Attribution Required, Share Alike).

Most viewed

In the past week
corpus
Description:
The ParlaMeter-sl corpus contains minutes of the National Assembly of the Republic of Slovenia and currently covers its VIIth mandate (2014-08-01 to 2018-06-22). The corpus contains speaker metadata (gender, age, education, ...
This entry contains 2 files (423.98 MB).
Publicly Available. Distributed under Creative Commons (Attribution Required, Share Alike).
languageDescription
Author(s):
Description:
ELMo language model (https://github.com/allenai/bilm-tf) used to produce contextual word embeddings, trained on the entire Gigafida 2.0 corpus (https://viri.cjvt.si/gigafida/System/Impressum) for 10 epochs. 1,364,064 most ...
This entry contains 2 files (212.96 MB).
Publicly Available
languageDescription
Author(s):
Description:
ELMo language model (https://github.com/allenai/bilm-tf) used to produce contextual word embeddings, trained on large monolingual corpora for 7 languages: Slovenian, Croatian, Finnish, Estonian, Latvian, Lithuanian and ...
This entry contains 7 files (1.35 GB).
Publicly Available