Files in this item
Download all files in item (15.45 MB)
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

- Name
- Frequency-list-of-language-problems-from-Solar-3.0.tsv
- Size
- 14.61 MB
- Format
- Unknown
- Description
- Dataset in TSV
- MD5
- f19b3e606d0715bb2d8685d97ef9a973

- Name
- README.txt
- Size
- 2.22 KB
- Format
- Text file
- Description
- Information on the dataset in TXT
- MD5
- 8b226231bd3d27fa83c8579a52155f34
***************
The dataset comprises sentences with language errors and the corresponding corrected sentences, together with additional information on the features of the source text. For detailed information, please refer to the Šolar 3.0 corpus entry in the repository and the accompanying annotation guidelines.
Arhar Holdt, Špela; et al., 2022, Developmental corpus Šolar 3.0, Slovenian language resource repository CLARIN.SI, ISSN 2820-4042, http://hdl.handle.net/11356/1589.
***************
"ID_besedila_s": An ID of the source text in the Šolar 3.0 corpus.
"ID_odstavka_s": An ID of the source paragraph in the Šolar 3.0 corpus.
"ID_stavka_s": An ID of the source sentence in the Šolar . . .
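
The README preview names ID columns such as "ID_besedila_s". As a minimal sketch of how the TSV might be read, the Python below counts rows per source text ID. It assumes the file is UTF-8, tab-separated, and starts with a header row that uses those column names; none of this is confirmed by the listing, and the function names are hypothetical.

```python
import csv
from collections import Counter
from typing import Iterable, Iterator

# Column name quoted in the README preview; assumed to appear in the header row.
TEXT_ID = "ID_besedila_s"


def iter_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per data row, keyed by the header row's column names."""
    # QUOTE_NONE treats quote characters inside sentences as ordinary text.
    yield from csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)


def rows_per_text(lines: Iterable[str]) -> Counter:
    """Count how many rows belong to each source text ID."""
    return Counter(row[TEXT_ID] for row in iter_rows(lines))


def print_top_texts(path: str, top: int = 5) -> None:
    """Print the source text IDs with the most rows (path is a local file)."""
    with open(path, encoding="utf-8", newline="") as handle:
        for text_id, count in rows_per_text(handle).most_common(top):
            print(text_id, count)
```

Calling `print_top_texts("Frequency-list-of-language-problems-from-Solar-3.0.tsv")` on a local copy would list the five most frequent source text IDs, provided the assumptions above hold.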

- Name
- Smernice-za-oznacevanje-korpusa-Solar_V1.1.pdf
- Size
- 856.9 KB
- Format
- Description
- Error annotation guidelines (in Slovenian)
- MD5
- c8b8b68fd1be51e1edadb7dd249b3ab4
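
The MD5 values listed for each file can be used to check a download for corruption. The following Python is a minimal sketch of that check: the file names and checksums are copied from the listing above, while the function names and the idea of a single local download directory are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# File names and MD5 values copied from the file listing.
EXPECTED_MD5 = {
    "Frequency-list-of-language-problems-from-Solar-3.0.tsv": "f19b3e606d0715bb2d8685d97ef9a973",
    "README.txt": "8b226231bd3d27fa83c8579a52155f34",
    "Smernice-za-oznacevanje-korpusa-Solar_V1.1.pdf": "c8b8b68fd1be51e1edadb7dd249b3ab4",
}


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory: Path) -> dict[str, bool]:
    """Map each listed file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = directory / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results
```

For example, `verify(Path("downloads"))` returns a dictionary with one entry per listed file; a `False` value means the file is missing from that (hypothetical) directory or its checksum differs.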