dc.contributor.author | Šnajder, Jan |
dc.contributor.author | Alagić, Domagoj |
dc.date.accessioned | 2019-03-01T11:31:19Z |
dc.date.available | 2019-03-01T11:31:19Z |
dc.date.issued | 2018-12-14 |
dc.identifier.uri | http://hdl.handle.net/11356/1218 |
dc.description | SenseGraph is a graph-like structure of word senses for the most common words of standard Croatian, obtained from human-provided lexical substitutes for target words in context. SenseGraph is encoded in the Lexical Markup Framework (LMF; ISO 24613:2008) format. SenseGraph consists of SenseCells, which are clusters of same-sense words obtained by grouping words based on the similarity of their lexical substitution sets and the contexts they appear in. SenseCells can be thought of as synsets in standard computational lexicographic terminology, although they exhibit more variability, which can be attributed to sense modulations in specific contexts. SenseCells are linked to each other based on loose semantic relatedness. In total, the resource covers 649 Croatian words across three parts of speech: nouns, verbs, and adjectives. More specifically, the resource contains 4,172 sentences across 230 nouns, 3,288 sentences across 200 verbs, and 4,116 sentences across 219 adjectives. These sentences were then clustered using a lexical-substitution-based clustering method, yielding 2,877 synsets. The sentences were sampled from the SETimes.HR and hrWaC corpora. Total number of sentences: 11,576. Total number of synsets: 2,877. Total number of words: 649. |
dc.language.iso | hrv |
dc.publisher | Faculty of Electrical Engineering and Computing, University of Zagreb |
dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-sa/4.0/ |
dc.rights.label | PUB |
dc.subject | lexical database |
dc.subject | semantic lexicon |
dc.subject | lexical substitutes |
dc.title | Croatian SenseGraph 1.0 |
dc.type | lexicalConceptualResource |
metashare.ResourceInfo#ContentInfo.detailedType | computationalLexicon |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
demo.uri | http://sensehive.takelab.fer.hr/ |
contact.person | Jan Šnajder jan.snajder@fer.hr TakeLab, FER, University of Zagreb |
sponsor | Croatian Science Foundation (HRZZ) UIP-2014-09-7312 SenseHive: Dynamic Crowdsourcing Models for Incremental Construction of Lexico-Semantic Resources nationalFunds |
size.info | 11576 sentences |
size.info | 2877 synsets |
size.info | 649 words |
files.count | 1 |
files.size | 978873 |
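
The description says the resource is LMF-encoded, and the archive listing includes a WN-LMF-1.0.dtd, so one way to check the counts above is to tally elements in the XML file. The following is a minimal sketch, not the authors' tooling: the element and attribute names (`LexicalEntry`, `Lemma` with `partOfSpeech`, `Synset`, `SynsetRelation`) are assumed from the WN-LMF 1.0 schema and have not been verified against the actual file, and `summarize_lmf` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_lmf(source):
    """Count entries, synsets and relations in a WN-LMF-style XML document.

    `source` is a file path or a binary file-like object. The tag names
    used here are ASSUMED from the WN-LMF 1.0 DTD; adjust them if the
    actual SenseGraph file uses different ones.
    """
    root = ET.parse(source).getroot()

    # Tally lemmas by their part-of-speech attribute (None if missing).
    pos_counts = Counter(lemma.get("partOfSpeech") for lemma in root.iter("Lemma"))

    return {
        "entries": sum(1 for _ in root.iter("LexicalEntry")),
        "synsets": sum(1 for _ in root.iter("Synset")),
        "relations": sum(1 for _ in root.iter("SynsetRelation")),
        "entries_by_pos": dict(pos_counts),
    }
```

Calling `summarize_lmf("SenseGraph-HR-v1.0-LMF/SenseGraph-HR-v1.0-LMF.xml")` on the unpacked archive should, if the assumptions about the schema hold, report a synset count that can be compared with the 2,877 stated in the record.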
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

- Name: SenseGraph-HR-v1.0-LMF.zip
- Size: 955.93 KB
- Format: application/zip
- Description: SenseGraph-HR-v1.0 archive
- MD5: 023a5855da6b6bb37819b3635fecf755
- SenseGraph-HR-v1.0-LMF/
  - README.md (2 kB)
  - SenseGraph-HR-v1.0-LMF.xml (2 MB)
  - WN-LMF-1.0.dtd (8 kB)
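
The MD5 value listed for the archive can be used to confirm that a download is intact. A minimal sketch using Python's standard `hashlib` follows; the helper names `md5_of_file` and `matches_expected` are illustrative, and the local file path is whatever the archive was saved as.

```python
import hashlib

# Checksum as listed in this record for SenseGraph-HR-v1.0-LMF.zip.
EXPECTED_MD5 = "023a5855da6b6bb37819b3635fecf755"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("SenseGraph-HR-v1.0-LMF.zip")` should return `True` for an uncorrupted copy of the archive. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against tampering.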