What's New

 corpus 
corpus
Description:
This corpus can be used to build and evaluate methods for extracting and presenting knowledge based on a semantic hypergraph. The corpus consists of 184 simple, complex and dependently complex sentences. All sentences are ...
 This item contains 1 file (21.66 KB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike
 corpus 
corpus
Description:
SuperGLUE is a benchmark styled after GLUE with a new set of more difficult language understanding tasks, improved resources, and a public leaderboard. It is comprised of 8 corpora (BoolQ, CB, COPA, MultiRC, ReCoRD, RTE, ...
 This item contains 3 files (25.43 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required
 toolService 
toolService
Description:
This model for morphosyntactic annotation of standard Macedonian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the 1984 training corpus (to be published) and ...
 This item contains 2 files (234.18 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike

Most Viewed Items

Top Last Week
 corpus 
corpus
Description:
The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel and sentence aligned corpus contains the novel in the English original (about 100,000 words in length), and its translations ...
 This item contains 1 file (14.12 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Noncommercial Share Alike
 lexicalConceptualResource 
lexicalConceptualResource
Description:
A lexicon of 751 emoji characters with automatically assigned sentiment. The sentiment is computed from 70,000 tweets, labeled by 83 human annotators in 13 European languages. The process and analysis of emoji sentiment ...
 This item contains 3 files (93.95 KB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike
 lexicalConceptualResource 
lexicalConceptualResource
Description:
Sloleks is the reference morphological lexicon for Slovenian language, developed to be used in NLP applications and language manuals. Encoded in LMF XML, the lexicon contains approx. 100,000 most frequent Slovenian lemmas, ...
 This item contains 2 files (85.8 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Noncommercial Share Alike