dc.contributor.author Pahor de Maiti Tekavčič, Kristina
dc.date.accessioned 2025-05-13T09:56:40Z
dc.date.available 2025-05-13T09:56:40Z
dc.date.issued 2025-05-09
dc.identifier.uri http://hdl.handle.net/11356/2030
dc.description The FRENK-MRW dataset contains French and Slovene socially unacceptable Facebook comments that were manually annotated for metaphor and metonymy based on the observed incongruity between the basic and the contextual meaning. The comments were posted between 2015 and 2017 under Facebook posts published by major news media outlets on the topics of LGBTQIA+/homophobia and migration/islamophobia. This entry includes the dataset divided into four files in CSV format, two with French comments (metadata: meta_fr, metaphor/metonymy annotations: mrw_fr) and two with Slovene comments (metadata: meta_sl, metaphor/metonymy annotations: mrw_sl). Also attached are the annotation guidelines and a README file explaining the file structure, both in TXT format. The dataset uses a selection of Slovene socially unacceptable comments from FRENK 1.1 (http://hdl.handle.net/11356/1462) and French socially unacceptable comments from FRENK-fr 1.0 (http://hdl.handle.net/11356/1947). The French data from FRENK-fr 1.0 was linguistically annotated with the FreeLing tagger (https://aclanthology.org/L12-1224/), while the Slovene data from FRENK 1.1 was processed with the CLASSLA tagger (http://hdl.handle.net/11356/1337). Manual annotation was performed in a WebAnno deployment (webanno.github.io/webanno) hosted at CLARIN.SI.
FRENK-MRW is a set of 2,000 comments based on a selection of news items (POST_CONTENT (NEWS) column) chosen according to two criteria: (1) for ease of annotation and interpretation, the entire thread of comments had to be included (acceptable comments were excluded from the annotation), and (2) the total number of available comments linked to these news posts had to reach 2,000, equally distributed between the two languages (French, Slovene) and the two topics (migrants, LGBT). The French part of the dataset includes posts from Le Figaro and 20 minutes, with LGBT-related news coming only from the latter. In the Slovene part, the posts on both topics (migrants and LGBT) come from Nova24TV, Siol.net and 24ur. The 2,000 comments comprise 84,738 tokens. Not all comments contain metaphors: 541 comments in the French part and 571 comments in the Slovene part contain at least one metaphorically used token. In total, there are 1,192 metaphorically used tokens in the French part of the dataset and 1,270 in the Slovene part.
dc.language.iso eng
dc.language.iso fra
dc.language.iso slv
dc.publisher Faculty of Arts, University of Ljubljana
dc.publisher Institute of Contemporary History
dc.publisher CY Cergy Paris University
dc.rights CLARIN.SI Licence ACA ID-BY-NC-INF-NORED 1.0
dc.rights.uri https://clarin.si/repository/xmlui/page/licence-aca-id-by-nc-inf-nored-1.0
dc.rights.label ACA
dc.subject offensive language
dc.subject hate speech
dc.subject user comment
dc.subject social media
dc.subject metaphor
dc.subject metonymy
dc.title French and Slovene offensive language metaphor and metonymy annotated dataset FRENK-MRW 1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Kristina Pahor de Maiti Tekavčič kristina.pahordemaiti@ff.uni-lj.si Faculty of Arts, University of Ljubljana
sponsor Slovenian Research Agency (ARIS) P6-0436 Digital humanities: resources, tools and methods nationalFunds
sponsor CY Cergy Paris University (Paris Seine Initiative), EU (EUTOPIA) 22IAGOD744 The linguistic landscape of hateful discourse online in France and Slovenia Other
sponsor Slovenian Research Agency (ARIS) N6-0099 LiLaH: Linguistic Landscape of Hate Speech nationalFunds
sponsor Jožef Stefan Institute CLARIN CLARIN.SI nationalFunds
size.info 84738 tokens
size.info 2000 texts
files.count 1
files.size 1906250


Files in this item

This item is Academic Use and licensed under: CLARIN.SI Licence ACA ID-BY-NC-INF-NORED 1.0
Inform Before Use, Attribution Required, Noncommercial
Name frenk-mrw.zip
Size 1.82 MB
Format application/zip
Description README, annotation guidelines, CSV metadata and data
MD5 35fe363e51534bb5302e46206687de97
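
The published MD5 value can be used to check a downloaded copy of frenk-mrw.zip before working with the CSV files. The following is a minimal Python sketch, not part of the record itself. It assumes the archive has been saved locally, and that the CSV files are UTF-8 encoded and comma-delimited, which this record does not state; the README in the archive is the authoritative description of the file structure. The function names md5_of_file and csv_headers are illustrative only.

```python
import csv
import hashlib
import io
import zipfile

# MD5 checksum published in this record for frenk-mrw.zip
EXPECTED_MD5 = "35fe363e51534bb5302e46206687de97"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def csv_headers(zip_path, delimiter=","):
    """Map each .csv member of the archive to its first (header) row.

    Assumption: the CSV files are UTF-8 encoded and comma-delimited.
    Adjust `delimiter` or the encoding if the README says otherwise.
    """
    headers = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip the README and annotation guidelines
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                headers[name] = next(csv.reader(text, delimiter=delimiter), [])
    return headers
```

A typical use would be to compare md5_of_file("frenk-mrw.zip") with EXPECTED_MD5 and, if they match, call csv_headers("frenk-mrw.zip") to see which columns each of the four CSV files provides.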
