dc.contributor.author Pelicon, Andraž
dc.contributor.author Pranjić, Marko
dc.contributor.author Miljković, Dragana
dc.contributor.author Škrlj, Blaž
dc.contributor.author Pollak, Senja
dc.date.accessioned 2020-09-24T11:33:35Z
dc.date.available 2020-09-24T11:33:35Z
dc.date.issued 2020-09-15
dc.identifier.uri http://hdl.handle.net/11356/1342
dc.description We present a collection of sentiment annotations for news articles (article links) in the Croatian language. A set of 2025 news articles was gathered from 24sata, one of the leading media companies in Croatia with the highest circulation. Six annotators annotated the articles on the document level using a five-level Likert scale (1: very negative, 2: negative, 3: neutral, 4: positive, 5: very positive). The final sentiment of an instance was defined as the average of the sentiment scores given by the different annotators. An instance was labeled as negative if the average of the given scores was less than or equal to 2.4; neutral if the average was strictly between 2.4 and 3.6; or positive if the average was greater than or equal to 3.6. The annotation guidelines correspond to those of the Slovenian sentiment-annotated news collection SentiNews 1.0 (http://hdl.handle.net/11356/1110). If you use the dataset, please cite the following paper, which also contains details on the dataset creation and on the monolingual and cross-lingual sentiment classification experiments: Pelicon, A.; Pranjić, M.; Miljković, D.; Škrlj, B.; Pollak, S. Zero-Shot Learning for Cross-Lingual News Sentiment Classification. Appl. Sci. 2020, 10, 5993. https://doi.org/10.3390/app10175993
dc.language.iso hrv
dc.publisher Jožef Stefan Institute
dc.relation info:eu-repo/grantAgreement/EC/H2020/825153
dc.relation.isreferencedby https://doi.org/10.3390/app10175993
dc.rights Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.label PUB
dc.source.uri http://embeddia.eu/
dc.subject news corpus
dc.subject sentiment classification
dc.subject manual annotation
dc.title Sentiment Annotated Dataset of Croatian News
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Andraž Pelicon Andraz.Pelicon@ijs.si Jožef Stefan Institute
sponsor European Union EC/H2020/825153 EMBEDDIA - Cross-Lingual Embeddings for Less-Represented Languages in European News Media euFunds info:eu-repo/grantAgreement/EC/H2020/825153
sponsor European Union’s Rights, Equality and Citizenship Programme 875263 IMSyPP - Innovative Monitoring Systems and Prevention Policies of Online Hate Speech Other
sponsor ARRS (Slovenian Research Agency) P2-103 Knowledge Technologies nationalFunds
size.info 2025 entries
files.count 1
files.size 87652
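
The labeling rule given in dc.description (average the annotators' Likert scores, then apply the 2.4 and 3.6 thresholds) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name label_from_scores and the use of exact fractions to avoid floating-point edge cases at the thresholds are assumptions made here.

```python
from fractions import Fraction

# Thresholds from the record's description, kept exact (2.4 = 12/5, 3.6 = 18/5).
NEGATIVE_MAX = Fraction(12, 5)   # average <= 2.4 -> negative
POSITIVE_MIN = Fraction(18, 5)   # average >= 3.6 -> positive


def label_from_scores(scores):
    """Map per-annotator Likert scores (1..5) to a document-level label.

    Hypothetical helper: the name and signature are not part of the dataset.
    """
    if not scores:
        raise ValueError("at least one annotator score is required")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("scores must be integers on the 1..5 Likert scale")

    # Exact average of the integer scores.
    avg = Fraction(sum(scores), len(scores))

    if avg <= NEGATIVE_MAX:
        return "negative"
    if avg >= POSITIVE_MIN:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Made-up scores for illustration only; not taken from the dataset.
    print(label_from_scores([2, 3, 3, 4, 3, 3]))
```

Both boundary values are inclusive on the outer classes, so an average of exactly 2.4 is negative and an average of exactly 3.6 is positive; only averages strictly between the two are neutral.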


 Files in this item

Name croatian_sentiment_news_document.tsv
Size 85.6 KB
Format Unknown
Description Sentiment annotations of Croatian news
MD5 ca5e2d5cf131856912784230584a5389
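
Since the record lists an MD5 checksum and a byte size for the file, a downloaded copy can be checked against them. The sketch below is one possible way to do this with Python's standard library; the helper names and the local path are hypothetical, and it assumes the file has been saved under its original name.

```python
import hashlib
import os

# Values copied from this record (MD5 and files.size fields).
EXPECTED_MD5 = "ca5e2d5cf131856912784230584a5389"
EXPECTED_SIZE = 87652  # bytes


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Check a local copy against the checksum and size in the record."""
    return (
        os.path.getsize(path) == expected_size
        and md5_of_file(path) == expected_md5.lower()
    )


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the file was saved.
    local_path = "croatian_sentiment_news_document.tsv"
    if os.path.exists(local_path):
        print("matches record:", matches_record(local_path))
```

MD5 is used here only because it is the checksum the record publishes; it detects accidental corruption during download but is not a safeguard against deliberate tampering.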
