
 
dc.contributor.author Rozman, Tadeja
dc.contributor.author Stritar Kučuk, Mojca
dc.contributor.author Kosem, Iztok
dc.contributor.author Krek, Simon
dc.contributor.author Krapš Vodopivec, Irena
dc.contributor.author Arhar Holdt, Špela
dc.contributor.author Stabej, Marko
dc.contributor.author Laskowski, Cyprian
dc.contributor.author Klemenc, Bojan
dc.date.accessioned 2018-11-21T16:52:08Z
dc.date.available 2018-11-21T16:52:08Z
dc.date.issued 2018-07-29
dc.identifier.uri http://hdl.handle.net/11356/1150
dc.description Šolar-Clear is an adapted version of the Šolar 1.0 corpus, cf. http://hdl.handle.net/11356/1036. The Šolar(-Clear) corpus consists of texts written by students in Slovene primary and secondary schools. School essays form the majority of the corpus (64.2%), while other material includes texts created during lessons, such as text recapitulations or descriptions, examples of formal applications, etc. Unlike the original Šolar corpus, Šolar-Clear includes only student texts; language corrections and other types of feedback from the teachers are not included. The corpus can thus be used for processing tasks where the inclusion of corrections hinders or complicates the procedures (e.g. comparative data extraction, training of language models, etc.).
dc.language.iso slv
dc.publisher Trojina, Institute for Applied Slovene Studies
dc.relation.replaces http://hdl.handle.net/11356/1036
dc.relation.isreplacedby http://hdl.handle.net/11356/1219
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-nc-sa/4.0/
dc.rights.label PUB
dc.source.uri http://ssj.slovenscina.eu/korpusi/solar
dc.subject student writing
dc.subject developmental corpus
dc.title Developmental corpus of Slovene (without language corrections) Šolar-Clear
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
hidden hidden
has.files yes
branding CLARIN.SI data & tools
contact.person Iztok Kosem iztok.kosem@trojina.si Trojina, Institute for Applied Slovene Studies
sponsor ARRS (Slovenian Research Agency) I0-0051 Centre for Applied Linguistics (CUJ) nationalFunds
sponsor Ministry of Culture 3340-15-141006 Upgrade of Šolar Corpus nationalFunds
size.info 1098107 tokens
size.info 2703 texts
files.count 1
files.size 6414297


 Files in this item

Name solar-clear.zip
Size 6.12 MB
Format application/zip
Description Corpus in XML
MD5 b1f68ebff540a72c7a930a1c036c6584
 File preview
    • solar-clear.xml (61 MB)
    • solar-clear.rnc (1 kB)
    • 00README.txt (228 B)
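
The record lists an MD5 checksum and a byte size (files.size) for solar-clear.zip, so a downloaded copy can be checked against them. The following is a minimal sketch of such a check, not part of the record itself; the function names and the assumption that the archive is saved as solar-clear.zip in the current directory are illustrative only.

```python
import hashlib
from pathlib import Path

# Values copied from the record (MD5 and files.size fields).
EXPECTED_MD5 = "b1f68ebff540a72c7a930a1c036c6584"
EXPECTED_SIZE = 6414297  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file matches both the expected size and the MD5."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Assumption: the archive was saved under this name in the current directory.
    local_copy = Path("solar-clear.zip")
    if local_copy.exists():
        print("OK" if verify_download(local_copy) else "MISMATCH")
    else:
        print("solar-clear.zip not found")
```

The size is compared first because it is cheap; the checksum is only computed when the size already matches.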
