dc.contributor.author | Rozman, Tadeja |
dc.contributor.author | Stritar Kučuk, Mojca |
dc.contributor.author | Kosem, Iztok |
dc.contributor.author | Krek, Simon |
dc.contributor.author | Krapš Vodopivec, Irena |
dc.contributor.author | Arhar Holdt, Špela |
dc.contributor.author | Stabej, Marko |
dc.contributor.author | Laskowski, Cyprian |
dc.contributor.author | Klemenc, Bojan |
dc.date.accessioned | 2018-11-21T16:52:08Z |
dc.date.available | 2018-11-21T16:52:08Z |
dc.date.issued | 2018-07-29 |
dc.identifier.uri | http://hdl.handle.net/11356/1150 |
dc.description | Šolar-Clear is an adapted version of the Šolar 1.0 corpus, cf. http://hdl.handle.net/11356/1036. The Šolar(-Clear) corpus consists of texts written by students in Slovene primary and secondary schools. School essays form the majority of the corpus (64.2%); other material includes texts created during lessons, such as text recapitulations or descriptions, examples of formal applications, etc. Unlike the original Šolar corpus, Šolar-Clear includes only the student texts; language corrections and other types of teacher feedback are omitted. The corpus can thus be used for processing tasks where the inclusion of corrections would hinder or complicate the procedure (e.g. comparative data extraction, training of language models, etc.). |
dc.language.iso | slv |
dc.publisher | Trojina, Institute for Applied Slovene Studies |
dc.relation.replaces | http://hdl.handle.net/11356/1036 |
dc.relation.isreplacedby | http://hdl.handle.net/11356/1219 |
dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
dc.rights.label | PUB |
dc.source.uri | http://eng.slovenscina.eu/solar.html |
dc.subject | student writing |
dc.subject | developmental corpus |
dc.title | Developmental corpus of Slovene (without language corrections) Šolar-Clear |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
hidden | hidden |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Iztok Kosem iztok.kosem@trojina.si Trojina, Institute for Applied Slovene Studies |
sponsor | ARRS (Slovenian Research Agency) I0-0051 Centre for Applied Linguistics (CUJ) nationalFunds |
sponsor | Ministry of Culture 3340-15-141006 Upgrade of Šolar Corpus nationalFunds |
size.info | 1098107 tokens |
size.info | 2703 texts |
files.count | 1 |
files.size | 6414297 |
Files in this item
This item is Publicly Available and licensed under:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

- Name: solar-clear.zip
- Size: 6.12 MB
- Format: application/zip
- Description: Corpus in XML
- MD5: b1f68ebff540a72c7a930a1c036c6584
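
The MD5 value and the byte size (files.size) listed in this record can be used to check a downloaded copy of solar-clear.zip. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_download` and the local file path are illustrative assumptions, not part of the record or of any repository tooling.

```python
import hashlib
from pathlib import Path

# Values copied from this record.
EXPECTED_MD5 = "b1f68ebff540a72c7a930a1c036c6584"
EXPECTED_SIZE = 6414297  # files.size, in bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until an empty bytes object signals EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file's size and MD5 both match the expected values.

    Hypothetical helper for illustration only.
    """
    p = Path(path)
    if p.stat().st_size != expected_size:
        return False
    return md5_of_file(p) == expected_md5.lower()
```

Assuming the archive was saved in the current directory, `verify_download("solar-clear.zip")` returns `True` only when both the size and the checksum agree with the record. MD5 is suitable here for detecting a corrupted or truncated download, not as a protection against deliberate tampering.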