dc.contributor.author | Kuvač Kraljević, Jelena |
dc.contributor.author | Hržica, Gordana |
dc.contributor.author | Štefanec, Vanja |
dc.contributor.author | Kologranić Belić, Lana |
dc.contributor.author | Ljubešić, Nikola |
dc.date.accessioned | 2021-07-02T08:32:23Z |
dc.date.available | 2021-07-02T08:32:23Z |
dc.date.issued | 2021-06-15 |
dc.identifier.uri | http://hdl.handle.net/11356/1435 |
dc.description | The corpus consists of texts produced by non-professional typical speakers and speakers with different language disorders (developmental language disorder, dyslexia, traumatic brain injury, aphasia, other). Roughly half of the corpus consists of texts by typical speakers, and the other half of texts by speakers with language disorders. Language samples were elicited by six groups of tasks representing different writing styles (descriptive, expository, narrative, and letter) and different levels of formality. The corpus has been manually annotated for normalized forms, lemmas, morphosyntactic information (following the MULTEXT-East tagset), and type of error (phonological segmentation, orthography, non-standard spelling, typo, syntax, etc.). The UD morphosyntactic descriptions were for the most part automatically generated from the MULTEXT-East morphosyntactic information. |
dc.language.iso | hrv |
dc.publisher | Jožef Stefan Institute |
dc.publisher | Faculty of Education and Rehabilitation, University of Zagreb |
dc.relation.isreferencedby | https://hrcak.srce.hr/index.php?show=clanak&id_clanak_jezik=370152 |
dc.relation.isreferencedby | https://www.aclweb.org/anthology/L16-1513.pdf |
dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-sa/4.0/ |
dc.rights.label | PUB |
dc.subject | non-professional written language |
dc.subject | speakers with language disorders |
dc.subject | typical speakers |
dc.title | Croatian corpus of non-professional written language by typical speakers and speakers with language disorders RAPUT 1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Hržica Gordana; gordana.hrzica@erf.unizg.hr; Faculty of Education and Rehabilitation, University of Zagreb |
sponsor | European Structural and Investment Funds; RC.2.2.08-0050; Computer assistant for text input for persons with language impairment (RAPUT); Other |
sponsor | Jožef Stefan Institute; CLARIN; CLARIN.SI; nationalFunds |
size.info | 6760 texts |
size.info | 34469 sentences |
size.info | 426187 tokens |
files.count | 2 |
files.size | 8500768 |
Files in this item
All files in item: 8.11 MB
This item is publicly available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

- Name: Raput.vert.zip
- Size: 3.04 MB
- Format: application/zip
- Description: Corpus in derived vertical (Sketch Engine / CQP) format
- MD5: dd08fcb3f4ffbeebbe04785f36f87729
- Contents: Raput.vert
  - raput_ncln.vert (11 MB)
  - raput_ncln.regi (2 kB)
  - raput_cln.vert (14 MB)
  - raput_cln.regi (2 kB)
  - 00README.txt (973 B)
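
The MD5 value listed for the archive can be used to check that a download arrived intact. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_checksum` are hypothetical helpers, and the local path `Raput.vert.zip` assumes the archive was saved under its listed name in the current directory.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 with a published value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Checksum copied from the listing above; the local path is an assumption.
    archive = Path("Raput.vert.zip")
    expected = "dd08fcb3f4ffbeebbe04785f36f87729"
    if archive.exists():
        print("OK" if matches_checksum(archive, expected) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same helper applies to any other archive in the item by passing its file name and its listed MD5 value.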

- Name: Raput.conllup.zip
- Size: 5.06 MB
- Format: application/zip
- Description: Corpus in conllup format
- MD5: d305e628ecc81d2333be183a48c813f5
- Contents: Raput.conllup
  - raput.conllup (33 MB)
  - 00README.txt (480 B)
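
A file in CoNLL-U Plus (conllup) format normally declares its columns in a `# global.columns = ...` header line, followed by tab-separated token lines with blank lines between sentences. The sketch below reads such a file generically. The column set actually used in `raput.conllup` is not stated on this page, so the reader takes whatever the header declares; `read_conllup` is a hypothetical helper, and the sample data is invented for illustration and is not taken from the corpus.

```python
import io


def read_conllup(lines):
    """Yield sentences as lists of {column: value} dicts from CoNLL-U Plus lines."""
    columns = None
    sentence = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("# global.columns ="):
            # Column names are whitespace-separated after the equals sign.
            columns = line.split("=", 1)[1].split()
        elif line.startswith("#"):
            continue  # other comment lines (sentence ids, text, ...) are skipped
        elif not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        else:
            if columns is None:
                raise ValueError("missing '# global.columns' header")
            sentence.append(dict(zip(columns, line.split("\t"))))
    if sentence:
        yield sentence


# Invented sample with an assumed column set; not corpus data.
SAMPLE = (
    "# global.columns = ID FORM LEMMA UPOS\n"
    "# sent_id = demo.1\n"
    "1\tOvo\tovaj\tDET\n"
    "2\tje\tbiti\tAUX\n"
    "3\ttest\ttest\tNOUN\n"
    "\n"
    "1\tDa\tda\tPART\n"
)

sentences = list(read_conllup(io.StringIO(SAMPLE)))
print(len(sentences), [tok["FORM"] for tok in sentences[0]])
```

For the real file, the same generator can be fed an open file handle, for example `open("raput.conllup", encoding="utf-8")`, so the 33 MB file is streamed sentence by sentence rather than loaded at once.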