dc.contributor.author | Vasić, Daniel |
dc.contributor.author | Žitko, Branko |
dc.contributor.author | Gašpar, Angelina |
dc.contributor.author | Ljubešić, Nikola |
dc.contributor.author | Štrkalj Despot, Kristina |
dc.contributor.author | Merkler, Danijela |
dc.date.accessioned | 2020-11-20T16:41:42Z |
dc.date.available | 2020-11-20T16:41:42Z |
dc.date.issued | 2020-11-20 |
dc.identifier.uri | http://hdl.handle.net/11356/1377 |
dc.description | This corpus can be used to build and evaluate methods for extracting and presenting knowledge based on a semantic hypergraph. The corpus consists of 184 simple, complex and dependently complex sentences. All sentences are annotated at the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation, syntactic dependencies, named entities, and semantic roles. The resource also includes a representation of a subset of 176 sentences in the form of a semantic hypergraph, which can be used to evaluate knowledge extraction methods for Croatian. The sentences in this corpus are taken from the textbook: Hudeček, L., Mihaljević, M., Sršen, J. and Čamagajevac, S. (2017). Hrvatska Školska Gramatika. Zagreb: Institut za hrvatski jezik i jezikoslovlje. https://gramatika.hr/impresum/ |
dc.language.iso | hrv |
dc.publisher | University of Mostar |
dc.publisher | University of Split |
dc.publisher | Jožef Stefan Institute |
dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-sa/4.0/ |
dc.rights.label | PUB |
dc.source.uri | https://www.acnltutor.net/ |
dc.subject | knowledge extraction |
dc.subject | knowledge representation |
dc.subject | semantics |
dc.subject | semantic role labeling |
dc.title | Semantic hypergraph corpus SemCRO 1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
demo.uri | https://bitbucket.org/danielvasic/croatiangraphbrain |
contact.person | Daniel Vasić daniel.vasic@fpmoz.sum.ba University of Mostar |
sponsor | Office of Naval Research N00014-20-1-2066 Enhancing Adaptive Courseware based on Natural Language Processing nationalFunds |
size.info | 184 sentences |
size.info | 176 semanticUnits |
files.count | 1 |
files.size | 22175 |
Files in this item
- Name: semcro.v1.zip
- Size: 21.66 KB
- Format: application/zip
- Description: Test corpus in extended CoNLL-U (184 sentences) and gold hypergraph (subset of 176 sentences).
- MD5: a48538e5aa3ddfe57c5bf9e2282c164d
- semcro.v1
  - semcro.test.conll (133 kB)
  - README.txt (1 kB)
  - semcro.hypergraph.hg (9 kB)
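
The MD5 value listed for semcro.v1.zip can be used to check that a download is intact. The following is a minimal sketch, not part of the resource itself: the helper name `md5_of_file` and the local path `semcro.v1.zip` are assumptions made for illustration, and only the Python standard library is used.

```python
import hashlib
from pathlib import Path

# MD5 value listed for semcro.v1.zip in this record.
EXPECTED_MD5 = "a48538e5aa3ddfe57c5bf9e2282c164d"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole file never has to sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("semcro.v1.zip")  # assumed location of the downloaded archive
    if archive.exists():
        actual = md5_of_file(archive)
        if actual == EXPECTED_MD5:
            print("checksum OK")
        else:
            print(f"checksum mismatch: {actual}")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.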