dc.contributor.author | Pelicon, Andraž |
dc.contributor.author | Pranjić, Marko |
dc.contributor.author | Miljković, Dragana |
dc.contributor.author | Škrlj, Blaž |
dc.contributor.author | Pollak, Senja |
dc.date.accessioned | 2020-09-24T11:33:35Z |
dc.date.available | 2020-09-24T11:33:35Z |
dc.date.issued | 2020-09-15 |
dc.identifier.uri | http://hdl.handle.net/11356/1342 |
dc.description | We present a collection of sentiment annotations for news articles (article links) in the Croatian language. A set of 2025 news articles was gathered from 24sata, one of the leading media companies in Croatia with the highest circulation. Six annotators annotated the articles on the document level using a five-level Likert scale (1 = very negative, 2 = negative, 3 = neutral, 4 = positive, and 5 = very positive). The final sentiment of an instance was defined as the average of the sentiment scores given by the different annotators. An instance was labeled as negative if the average of the given scores was less than or equal to 2.4; neutral if the average was strictly between 2.4 and 3.6; or positive if the average was greater than or equal to 3.6. The annotation guidelines correspond to those of the Slovenian sentiment-annotated news collection SentiNews 1.0 (http://hdl.handle.net/11356/1110). If you use the dataset, please cite the following paper, which also contains details on the dataset creation and on monolingual and cross-lingual sentiment classification experiments: Pelicon, A.; Pranjić, M.; Miljković, D.; Škrlj, B.; Pollak, S. Zero-Shot Learning for Cross-Lingual News Sentiment Classification. Appl. Sci. 2020, 10, 5993. https://doi.org/10.3390/app10175993 |
dc.language.iso | hrv |
dc.publisher | Jožef Stefan Institute |
dc.relation | info:eu-repo/grantAgreement/EC/H2020/825153 |
dc.relation.isreferencedby | https://doi.org/10.3390/app10175993 |
dc.rights | Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
dc.rights.label | PUB |
dc.source.uri | http://embeddia.eu/ |
dc.subject | news corpus |
dc.subject | sentiment classification |
dc.subject | manual annotation |
dc.title | Sentiment Annotated Dataset of Croatian News |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Andraž Pelicon Andraz.Pelicon@ijs.si Jožef Stefan Institute |
sponsor | European Union EC/H2020/825153 EMBEDDIA - Cross-Lingual Embeddings for Less-Represented Languages in European News Media euFunds info:eu-repo/grantAgreement/EC/H2020/825153 |
sponsor | European Union’s Rights, Equality and Citizenship Programme 875263 IMSyPP - Innovative Monitoring Systems and Prevention Policies of Online Hate Speech Other |
sponsor | ARRS (Slovenian Research Agency) P2-103 Knowledge Technologies nationalFunds |
size.info | 2025 entries |
files.count | 1 |
files.size | 87652 |
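
The labeling rule stated in dc.description can be sketched in a few lines of Python. This is only an illustration of the thresholds given there, not the authors' code: the function name `label_from_scores` and the constant names are assumptions, and exact fractions are used so that averages falling exactly on 2.4 or 3.6 are not affected by floating-point rounding.

```python
from fractions import Fraction

# Thresholds taken from the dataset description (2.4 and 3.6),
# stored as exact fractions to avoid floating-point edge cases.
NEGATIVE_MAX = Fraction(12, 5)   # 2.4
POSITIVE_MIN = Fraction(18, 5)   # 3.6


def label_from_scores(scores):
    """Map a list of annotator Likert scores (1-5) to a three-class label.

    Hypothetical helper: averages the scores and applies the thresholds
    from the description.
    """
    if not scores:
        raise ValueError("at least one annotator score is required")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("scores must be integers from 1 to 5")

    average = Fraction(sum(scores), len(scores))
    if average <= NEGATIVE_MAX:
        return "negative"
    if average >= POSITIVE_MIN:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Six made-up annotator scores for one article (not taken from the dataset).
    print(label_from_scores([2, 2, 3, 2, 1, 2]))
```

The column layout of the distributed TSV file is not described in this record, so the sketch works on a plain list of scores rather than reading the file.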
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

Name | croatian_sentiment_news_document.tsv |
Size | 85.6 KB |
Format | Unknown |
Description | Sentiment annotations of Croatian news |
MD5 | ca5e2d5cf131856912784230584a5389 |
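
After downloading, the file can be checked against the MD5 value listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_checksum` are assumptions introduced for illustration, and the local path is assumed to be wherever the file was saved.

```python
import hashlib

# MD5 value as listed in this record for croatian_sentiment_news_document.tsv.
EXPECTED_MD5 = "ca5e2d5cf131856912784230584a5389"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Hypothetical helper: True if the file's MD5 equals the listed value."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `matches_listed_checksum("croatian_sentiment_news_document.tsv")` would return `True` for an intact download saved under that name in the current directory.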