Latest

 corpus 
Description:
EACL Hackashop Keyword Challenge Datasets: In this repository you can find the IDs of articles used for the keyword extraction challenge at the EACL Hackashop on News Media Content Analysis and Automated Report Generation ...
 This entry contains 1 file (224.84 KB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Noncommercial, No Derivative Works
 corpus 
Description:
The FRENK dataset consists of comments to Facebook posts (news articles) of mainstream media outlets from Croatia, Great Britain, and Slovenia, on the topics of migrants and LGBT. The dataset contains whole discussion ...
 This entry contains 1 file (4.17 MB).
 
Academic Use, Inform Before Use, Attribution Required, Noncommercial
 corpus 
Description:
The Maj68 corpus contains 874 texts published between 1964 and 1972 in the periodicals "Tribuna", "Problemi" and "Problemi. Literatura." The texts contain complete bibliographical data, are classified according to text and ...
 This entry contains 5 files (786.02 MB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Noncommercial, Share Alike

Most viewed

In the past week
 corpus 
Description:
The SOFES speech database (Spoken Flight Enquiries in Slovene) is a collection of transcribed and segmented audio recordings of spoken flight-information enquiries in Slovene. SOFES is built on the basis of the GOPOLIS ...
 This entry contains 3 files (1.4 GB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Noncommercial, Share Alike
 corpus 
Description:
The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators. There are 15 Twitter corpora for the corresponding 15 European languages. The data can be used to train and evaluate ...
 This entry contains 16 files (49.38 MB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Share Alike