| dc.contributor.author | Rei, Luis |
| dc.contributor.author | Krek, Simon |
| dc.contributor.author | Mladenić, Dunja |
| dc.date.accessioned | 2016-11-28T13:47:36Z |
| dc.date.available | 2016-11-28T13:47:36Z |
| dc.date.issued | 2016-11-28 |
| dc.identifier.uri | http://hdl.handle.net/11356/1078 |
| dc.description | The xLiMe Twitter Corpus contains tweets in German, Italian and Spanish manually annotated with part-of-speech tags, named entities, and message-level sentiment polarity. In total, the corpus contains almost 20K annotated messages and 350K tokens. The corpus is described in: Luis Rei, Dunja Mladenić, Simon Krek. A Multilingual Social Media Linguistic Corpus. Proceedings of the 4th Conference on CMC and Social Media Corpora for the Humanities. 27–28 September 2016, Ljubljana, Slovenia. https://nl.ijs.si/janes/cmc-corpora2016/proceedings/ |
| dc.language.iso | spa |
| dc.language.iso | ita |
| dc.language.iso | deu |
| dc.publisher | Jožef Stefan Institute |
| dc.relation | info:eu-repo/grantAgreement/EC/FP7/611346 |
| dc.rights | The MIT License (MIT) |
| dc.rights.uri | https://opensource.org/licenses/mit-license.php |
| dc.rights.label | PUB |
| dc.source.uri | https://github.com/lrei/xlime_twitter_corpus |
| dc.subject | social media |
| dc.subject | computer-mediated communication |
| dc.subject | Twitter |
| dc.subject | part-of-speech tagging |
| dc.subject | named entities |
| dc.subject | sentiment classification |
| dc.subject | multilingual |
| dc.subject | manual annotation |
| dc.title | xLiMe Twitter Corpus XTC 1.0.1 |
| dc.type | corpus |
| metashare.ResourceInfo#ContentInfo.mediaType | text |
| hidden | false |
| hasMetadata | false |
| has.files | yes |
| branding | CLARIN.SI data & tools |
| contact.person | Luis Rei luis.rei@ijs.si Jožef Stefan Institute |
| sponsor | ICT Programme FP7-ICT-611346 xLiMe euFunds info:eu-repo/grantAgreement/EC/FP7/611346 |
| size.info | 363994 tokens |
| size.info | 19669 texts |
| files.count | 2 |
| files.size | 6592396 |