Files in this item
Download all files in item (5.19 MB)
This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)




- Name
- IMSyPP_SI_anotacije_training-clarin.csv
- Size
- 4.27 MB
- Format
- CSV file
- Description
- Training dataset: Slovenian Twitter sample labeled for hate speech type and target.
- MD5
- 0fe160a1e9ab82a723ba9b2ffee90f78

- Name
- IMSyPP_SI_anotacije_evaluation-report.txt
- Size
- 27.86 KB
- Format
- Text file
- Description
- Evaluation dataset: annotation agreement scores for the evaluation dataset.
- MD5
- a84c9b7a6c5cf99d0402160ec0d70273
Agreement report for both annotation questions, hate speech type (vrsta) and target (tarča), for the data in the file IMSyPP_SI_anotacije_evaluation-clarin.csv.

Hate speech types (vrsta):
0 appropriate (ni sporni govor)
1 inappropriate (nespodobni govor)
2 offensive (žalitev)
3 violent (nasilje)

Hate speech targets (tarča):
1 racism (ksenofobija in rasizem)
2 migrants (begunci/migranti)
3 islamophobia (islamofobija)
4 antisemitism (antisemitizem)
5 religion (druge religije)
6 homophobia (homofobija)
7 sexism (seksizem)
8 ideology (ideologija)
9 media (novinarji in mediji)
10 politics (politika/-i)
11 individual (posameznik)
12 other (drugo)

Annotated instances
a0 2000/2000
a1 2000/2000
a2 2000/2000
a3 2000/2000
a4 2000/2000
a5 2000/2000
a6 2000/2000
a7 2000/2000
a8 2000/2000
a9 2000/2000

-----------------
-----OVERALL-----
-----------------
Annotated for vrsta : 20000
0 ni sporni govor 13273
1 nespodobni govor 285
2 žalitev . . .
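
The code tables in the report preview above can be written down as lookup tables for decoding annotations. This is only a minimal sketch: the helper name `decode_labels` is hypothetical, and it assumes the CSV files store the numeric codes listed in the report, which the listing does not confirm.

```python
# Code-to-label tables copied from the agreement report preview.
# English labels first, original Slovenian labels in the comments.
VRSTA = {
    0: "appropriate",    # ni sporni govor
    1: "inappropriate",  # nespodobni govor
    2: "offensive",      # žalitev
    3: "violent",        # nasilje
}

TARCA = {
    1: "racism",         # ksenofobija in rasizem
    2: "migrants",       # begunci/migranti
    3: "islamophobia",   # islamofobija
    4: "antisemitism",   # antisemitizem
    5: "religion",       # druge religije
    6: "homophobia",     # homofobija
    7: "sexism",         # seksizem
    8: "ideology",       # ideologija
    9: "media",          # novinarji in mediji
    10: "politics",      # politika/-i
    11: "individual",    # posameznik
    12: "other",         # drugo
}


def decode_labels(vrsta_code, tarca_code=None):
    """Hypothetical helper: map numeric codes to English label names.

    Assumes annotations are stored as the integer codes above.
    A missing target code (None) is passed through unchanged.
    """
    vrsta = VRSTA[int(vrsta_code)]
    tarca = None if tarca_code is None else TARCA[int(tarca_code)]
    return vrsta, tarca
```

For example, `decode_labels(2, 7)` returns the pair `("offensive", "sexism")`.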

- Name
- IMSyPP_SI_anotacije_evaluation-clarin.csv
- Size
- 883.06 KB
- Format
- CSV file
- Description
- Evaluation dataset: Slovenian Twitter random sample labeled for hate speech type and target.
- MD5
- d1a0daa22905e4f1b582cd324b4c0074

- Name
- IMSyPP_SI_anotacije_training-report.txt
- Size
- 27.49 KB
- Format
- Text file
- Description
- Training dataset: annotation agreement scores for the training dataset.
- MD5
- 756517ba847f27ea5decae1f1bc7f46c
Agreement report for both annotation questions, hate speech type (vrsta) and target (tarča), for the data in the file IMSyPP_SI_anotacije_training-clarin.csv.

Hate speech types (vrsta):
0 appropriate (ni sporni govor)
1 inappropriate (nespodobni govor)
2 offensive (žalitev)
3 violent (nasilje)

Hate speech targets (tarča):
1 racism (ksenofobija in rasizem)
2 migrants (begunci/migranti)
3 islamophobia (islamofobija)
4 antisemitism (antisemitizem)
5 religion (druge religije)
6 homophobia (homofobija)
7 sexism (seksizem)
8 ideology (ideologija)
9 media (novinarji in mediji)
10 politics (politika/-i)
11 individual (posameznik)
12 other (drugo)

Annotated instances
a0 9997 / 10000
a1 9950 / 10000
a2 9929 / 10000
a3 9992 / 10000
a4 10000 / 10000
a5 10000 / 10000
a6 9998 / 10000
a7 9979 / 10000
a8 9973 / 10000
a9 9991 / 10000

-----------------
-----OVERALL-----
-----------------
Annotated for vrsta : 99809
vrsta 0 ni sporni govo . . .
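
Each file in this item is listed with an MD5 checksum, so a download can be checked against the listing. The sketch below uses only Python's standard `hashlib`; the function names `md5_of` and `verify` are illustrative, and it assumes the files were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums exactly as given in the listing above.
EXPECTED_MD5 = {
    "IMSyPP_SI_anotacije_training-clarin.csv": "0fe160a1e9ab82a723ba9b2ffee90f78",
    "IMSyPP_SI_anotacije_evaluation-report.txt": "a84c9b7a6c5cf99d0402160ec0d70273",
    "IMSyPP_SI_anotacije_evaluation-clarin.csv": "d1a0daa22905e4f1b582cd324b4c0074",
    "IMSyPP_SI_anotacije_training-report.txt": "756517ba847f27ea5decae1f1bc7f46c",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed file name to True if it exists and its MD5 matches.

    Assumption: files sit directly in `directory` under their listed names.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify(".").items():
        print(("OK      " if ok else "MISMATCH") + " " + name)
```

A file reported as a mismatch is either missing from the directory or differs from the published copy, in which case downloading it again is the usual first step.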