dc.contributor.author Brglez, Mojca
dc.contributor.author Zayed, Omnia
dc.contributor.author Buitelaar, Paul
dc.date.accessioned 2025-11-19T12:57:15Z
dc.date.available 2025-11-19T12:57:15Z
dc.date.issued 2023-01-24
dc.identifier.uri http://hdl.handle.net/11356/1787
dc.description TCMeta is a dataset of noun phrase constructions from COVID-related tweets, annotated for relation-level metaphor. It contains 2,138 Slovene and 2,221 English instances in tab-separated (.tsv) format, where each line presents a unique phrase extracted from a COVID-related tweet. The primary annotation is the COVID metaphor label (whether the phrase expresses a metaphor relating to COVID); additional annotations mark idioms, metaphors not relating to COVID, and metaphors not evident at the relation level. The full tweet text could not be published under the Terms of Service of the platform then known as Twitter. We recommend retrieving the text of the tweets via their IDs using the Hydrator tool [https://github.com/docnow/hydrator] or a similar tool. The dataset is further described in: Brglez, M., Zayed, O. & Buitelaar, P. TCMeta: a multilingual dataset of COVID tweets for relation-level metaphor analysis. Lang Resources & Evaluation 59, 437–475 (2025). https://doi.org/10.1007/s10579-024-09725-z.
@article{brglez2025tcmeta,
  title={{TCMeta}: a multilingual dataset of {COVID} tweets for relation-level metaphor analysis},
  author={Brglez, Mojca and Zayed, Omnia and Buitelaar, Paul},
  journal={Language Resources and Evaluation},
  pages={437--475},
  volume={59},
  year={2025},
  publisher={Springer},
  doi={10.1007/s10579-024-09725-z}
}
dc.language.iso eng
dc.language.iso slv
dc.publisher Faculty of Arts, University of Ljubljana
dc.relation info:eu-repo/grantAgreement/EC/H2020/883285
dc.relation.isreferencedby https://doi.org/10.1007/s10579-024-09725-z
dc.relation.replaces https://doi.org/10.5281/zenodo.16921580
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.subject metaphor
dc.subject Twitter
dc.subject social media
dc.subject COVID-19
dc.subject manual annotation
dc.title Multilingual dataset of COVID tweets for relation-level metaphor analysis TCMeta 1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Mojca Brglez mojca.brglez@ff.uni-lj.si Faculty of Arts, University of Ljubljana
sponsor European Union EC/H2020/883285 PANDEM-2 - Pandemic Preparedness and Response euFunds info:eu-repo/grantAgreement/EC/H2020/883285
sponsor Science Foundation Ireland SFI/12/RC/2289_P2 Insight nationalFunds
sponsor University of Ljubljana P6-0215 Slovene Language - Basic, Contrastive, and Applied Studies nationalFunds
size.info 4359 entries
files.count 2
files.size 234483


 Files in this item

This item is publicly available and licensed under Creative Commons - Attribution 4.0 International (CC BY 4.0).
Name README.txt
Size 2.39 KB
Format Text file
Description description
MD5 425c9b3f580725daba40877c2d73eecc
File preview:
TCMeta is a multilingual dataset of COVID tweets for relation-level metaphor analysis.

It contains 2,138 Slovene and 2,221 English noun phrase constructions extracted from COVID-related tweets that are annotated for relation-level metaphor.


The data is in tab-separated (.tsv) format. Each line presents a unique phrase extracted from a COVID-related tweet.

The primary annotations can be found in the column "COVID metaphor label" (whether the phrase expresses a metaphor relating to COVID). Additional annotations can be found in the "Comments" column, and include annotations of idioms, metaphors not relating to COVID, and metaphors not evident on the relation-level.


The data contains the following columns:

Language              the language of the tweet, 'sl' (Slovene) or 'en' (English)
Tweet ID              the unique identifier of the tweet, which can be used to retrieve the text of the post
Phrase                the phrase extracted from the tweet
COVID metaphor label  'y' (Yes) or 'n' (No): whether it is . . .
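
The column layout described above could be read with Python's standard csv module. The following is a minimal sketch, not the authors' own tooling. It assumes the file has a header row whose names match the column names in this README exactly ("Language", "Tweet ID", "Phrase", "COVID metaphor label", "Comments"); the helper names and the sample rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Invented sample rows that mimic the documented layout; NOT real TCMeta data.
SAMPLE = (
    "Language\tTweet ID\tPhrase\tCOVID metaphor label\tComments\n"
    "en\t1\texample phrase one\ty\t\n"
    "en\t2\texample phrase two\tn\t\n"
    "sl\t3\tprimer fraze\tn\t\n"
)


def load_rows(handle):
    """Read a tab-separated file with a header row into a list of dicts.

    QUOTE_NONE keeps any quotation marks inside phrases as literal text.
    All values, including tweet IDs, stay strings, so long numeric IDs
    are never rounded or reformatted.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def tally(rows):
    """Count rows per (language, COVID metaphor label) pair."""
    return Counter((r["Language"], r["COVID metaphor label"]) for r in rows)


def unique_tweet_ids(rows):
    """Return tweet IDs in first-seen order, without duplicates."""
    seen = {}
    for r in rows:
        seen.setdefault(r["Tweet ID"], None)
    return list(seen)


rows = load_rows(io.StringIO(SAMPLE))
for (language, label), count in sorted(tally(rows).items()):
    print(language, label, count)
```

To run this on the real file, the StringIO object would be replaced with something like open("TCMeta.v1.tsv", encoding="utf-8", newline=""); the UTF-8 encoding is an assumption. The list returned by unique_tweet_ids can be written one ID per line to a text file, which is the kind of input that hydration tools typically expect.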
                                            
Name TCMeta.v1.tsv
Size 226.6 KB
Format Unknown
Description dataset
MD5 a61a7c336d66d87f3822be638a744cb9
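
The MD5 values listed for the two files can be used to check that a download arrived intact. The sketch below uses Python's standard hashlib module; the function names and the assumption that the files sit in the current directory under their listed names are illustrative only.

```python
import hashlib

# Checksums as listed on this record page.
EXPECTED_MD5 = {
    "README.txt": "425c9b3f580725daba40877c2d73eecc",
    "TCMeta.v1.tsv": "a61a7c336d66d87f3822be638a744cb9",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, expected_md5):
    """Compare a file's MD5 digest with an expected value, ignoring case."""
    return md5_of_file(path) == expected_md5.lower()
```

A call such as matches_listed_md5("TCMeta.v1.tsv", EXPECTED_MD5["TCMeta.v1.tsv"]) would return True for an unmodified copy. MD5 is suitable here only for detecting accidental corruption, not for verifying authenticity.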
