
 
dc.contributor.author Evkoski, Bojan
dc.contributor.author Pelicon, Andraž
dc.contributor.author Mozetič, Igor
dc.contributor.author Ljubešić, Nikola
dc.contributor.author Kralj Novak, Petra
dc.date.accessioned 2021-07-21T17:02:01Z
dc.date.available 2021-07-21T17:02:01Z
dc.date.issued 2021-07-20
dc.identifier.uri http://hdl.handle.net/11356/1423
dc.description The dataset represents Slovenian-language Twitter activity from 2018 to 2020. It consists of tweet IDs, retweet IDs, pseudo-anonymized user IDs, publication dates, and hate speech labels (acceptable, inappropriate, offensive, violent) assigned automatically by the model at https://huggingface.co/IMSyPP/hate_speech_slo. The dataset is the basis for the following two papers: "Retweet communities reveal the main source of hate speech" (https://arxiv.org/pdf/2105.14898.pdf) and "Community evolution in retweet networks" (https://arxiv.org/pdf/2105.06214.pdf).
dc.language.iso slv
dc.publisher Jožef Stefan Institute
dc.relation.isreferencedby https://arxiv.org/pdf/2105.14898.pdf
dc.relation.isreferencedby https://arxiv.org/pdf/2105.06214.pdf
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri http://imsypp.ijs.si
dc.subject Twitter
dc.subject hate speech
dc.subject retweet networks
dc.title Slovenian Twitter dataset 2018-2020 1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Nikola Ljubešić nikola.ljubesic@ijs.si Jožef Stefan Institute
sponsor European Union’s Rights, Equality and Citizenship Programme 875263 IMSyPP - Innovative Monitoring Systems and Prevention Policies of Online Hate Speech Other
sponsor ARRS (Slovenian Research Agency) N6-0099 LiLaH: Linguistic Landscape of Hate Speech nationalFunds
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor ARRS (Slovenian Research Agency) P2-103 Knowledge Technologies nationalFunds
size.info 12961136 texts
files.count 1
files.size 190882288


 Files in this item

Name: clarin_plos.zip
Size: 182.04 MB
Format: application/zip
Description: Dataset in CSV format
MD5: 58a693968c40b81b4cf483265e918a6a
 File preview
    • README.txt (843 B)
    • clarin_plos_15072021.csv (609 MB)
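
The record lists an MD5 checksum and a byte count (files.size) for clarin_plos.zip, so a downloaded copy can be checked against both before unpacking. The following is a minimal sketch of such a check, not part of the dataset's own tooling. It assumes the archive was saved under its listed name in the current directory; the function names md5_of_file and verify_download are hypothetical helpers introduced here for illustration.

```python
import hashlib
from pathlib import Path

# Values copied from this record (MD5 field and files.size).
EXPECTED_MD5 = "58a693968c40b81b4cf483265e918a6a"
EXPECTED_SIZE = 190882288  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True only if both the file size and the MD5 digest match."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = Path("clarin_plos.zip")
    if archive.exists():
        print("OK" if verify_download(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Checking the size first is a cheap way to catch a truncated download before hashing the whole archive. The listed 182.04 MB appears to correspond to 190882288 bytes divided by 1024 squared.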
