
 
dc.contributor.author Šnajder, Jan
dc.contributor.author Alagić, Domagoj
dc.date.accessioned 2019-03-01T11:31:19Z
dc.date.available 2019-03-01T11:31:19Z
dc.date.issued 2018-12-14
dc.identifier.uri http://hdl.handle.net/11356/1218
dc.description SenseGraph is a graph-like structure of word senses of the most common words of standard Croatian, obtained by relying on human-provided lexical substitutes for target words in context. SenseGraph is encoded in the Lexical Markup Framework (LMF; ISO 24613:2008) format. SenseGraph consists of SenseCells, which are clusters of same-sense words obtained by grouping words based on the similarity of their lexical substitution sets and the contexts they appear in. SenseCells can be thought of as synsets in standard computational lexicographic terminology, although they exhibit more variability, which can be attributed to sense modulations in specific contexts. SenseCells are linked to each other based on loose semantic relatedness. In total, the resource covers 649 Croatian words across three parts of speech: nouns, verbs, and adjectives. More specifically, the resource contains 4,172 sentences across 230 nouns, 3,288 sentences across 200 verbs, and 4,116 sentences across 219 adjectives. The sentences were sampled from the SETimes.HR and hrWaC corpora and then clustered using a lexical-substitution-based clustering method, yielding 2,877 synsets (SenseCells). Total number of sentences: 11,576. Total number of SenseCells: 2,877. Total number of words: 649.
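The per-part-of-speech figures in the description add up to the stated totals. A minimal sketch of that arithmetic follows; the dictionary layout and the name `totals` are illustrative assumptions and are not part of the resource or its LMF encoding.

```python
# Per-part-of-speech counts copied from the description above.
# The structure and names here are hypothetical, chosen only for illustration.
PER_POS = {
    "noun": {"words": 230, "sentences": 4172},
    "verb": {"words": 200, "sentences": 3288},
    "adjective": {"words": 219, "sentences": 4116},
}


def totals(per_pos):
    """Sum the word and sentence counts over all parts of speech."""
    return {
        "words": sum(entry["words"] for entry in per_pos.values()),
        "sentences": sum(entry["sentences"] for entry in per_pos.values()),
    }


if __name__ == "__main__":
    print(totals(PER_POS))  # {'words': 649, 'sentences': 11576}
```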
dc.language.iso hrv
dc.publisher Faculty of Electrical Engineering and Computing, University of Zagreb
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.subject lexical database
dc.subject semantic lexicon
dc.subject lexical substitutes
dc.title Croatian SenseGraph 1.0
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType computationalLexicon
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
demo.uri http://sensehive.takelab.fer.hr/
contact.person Jan Šnajder jan.snajder@fer.hr TakeLab, FER, University of Zagreb
sponsor Croatian Science Foundation (HRZZ) UIP-2014-09-7312 SenseHive: Dynamic Crowdsourcing Models for Incremental Construction of Lexico-Semantic Resources nationalFunds
size.info 11576 sentences
size.info 2877 synsets
size.info 649 words
files.count 1
files.size 978873


 Files in this item

This item is publicly available and licensed under Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
Name SenseGraph-HR-v1.0-LMF.zip
Size 955.93 KB
Format application/zip
Description SenseGraph-HR-v1.0 archive
MD5 023a5855da6b6bb37819b3635fecf755
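The MD5 checksum and the byte size recorded above (files.size 978873) can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch, assuming the archive has already been saved locally; the function names `md5_of_file` and `verify_download` and the local file path are hypothetical, and only the checksum and size values come from this record.

```python
import hashlib
from pathlib import Path

# Values copied from this record.
EXPECTED_MD5 = "023a5855da6b6bb37819b3635fecf755"
EXPECTED_SIZE = 978873  # bytes, from files.size


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file's size and MD5 digest match the expected values."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    print(verify_download("SenseGraph-HR-v1.0-LMF.zip"))
```

The size is checked first because it is cheap and catches truncated downloads before the file is hashed.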