What's New

 toolService 
toolService
Author(s):
Description:
The model for lemmatisation of standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1183) ...
 This item contains 1 file (81.99 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike
 toolService 
toolService
Author(s):
Description:
The model for lemmatisation of standard Serbian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the SETimes.SR training corpus (http://hdl.handle.net/11356/1200) ...
 This item contains 1 file (87.49 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike
 toolService 
toolService
Author(s):
Description:
The model for lemmatisation of standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the ssj500k training corpus (http://hdl.handle.net/11356/1210) ...
 This item contains 1 file (37.8 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike

Most Viewed Items

Top Last Week
 corpus 
corpus
Author(s):
Description:
The corpus contains 256,567 documents from the Slovenian news portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and financial content. The submission contains 7 files: ...
 This item contains 8 files (616.88 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike
 lexicalConceptualResource 
lexicalConceptualResource
Description:
The MULTEXT-East morphosyntactic lexicons have a simple structure, where each line is a lexical entry with three tab-separated fields: (1) the word-form, the inflected form of the word; (2) the lemma, the base-form of the ...
 This item contains 6 files (12.05 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Noncommercial
 lexicalConceptualResource 
lexicalConceptualResource
Description:
Sloleks is the reference morphological lexicon for Slovenian language, developed to be used in NLP applications and language manuals. Encoded in LMF XML, the lexicon contains approx. 100.000 most frequent Slovenian lemmas, ...
 This item contains 4 files (53.69 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Noncommercial Share Alike