What's New

 corpus 
Description:
This comparable corpus collection consists of dumps of the Bosnian, Croatian, Macedonian, Montenegrin, Serbian, Serbo-Croatian and Slovenian Wikipedias, harvested on October 17th 2020. The text was extracted from ...
 This item contains 7 files (5.04 GB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required
 toolService 
Description:
The Orange Workflow for Observing Collocation Clusters ColEmbed 1.0. ColEmbed is a workflow (.OWS file) for Orange Data Mining (an open-source machine learning and data visualization software: https://orangedatamining.com/) ...
 This item contains 1 file (86.32 MB).
 
Publicly Available
 lexicalConceptualResource 
Description:
SLONEST stands for Slovene Ontologies of Semantic Types. The first subset – SLONEST-noun 1.0 – represents an ontology developed for nouns. SLONEST-noun contains an XML file with a total of 271 categories of semantic types: ...
 This item contains 1 file (58.7 KB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Share Alike

Most Viewed Items

Top Last Week
 corpus 
Description:
DGT-UD is a 2-billion-word, 23-language, syntactically parsed parallel corpus consisting of the JRC DGT translation memory of European law, automatically annotated with UDPipe 1.2 (http://ufal.mff.cuni.cz/udpipe) using ...
 This item contains 24 files (24.42 GB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required
 lexicalConceptualResource 
Description:
hrLex is a large inflectional lexicon of the Croatian language in which each entry is an 8-tuple: (wordform, lemma, MSD, MSD features, UPOS, morphological features, frequency, per-million frequency). The (wordform, lemma, ...
 This item contains 1 file (51.95 MB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Share Alike
 lexicalConceptualResource 
Description:
A lexicon of 751 emoji characters with automatically assigned sentiment. The sentiment is computed from 70,000 tweets, labeled by 83 human annotators in 13 European languages. The process and analysis of emoji sentiment ...
 This item contains 3 files (93.95 KB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Share Alike