Files in this item

 Download all files in item (15.45 MB)
Name
Frequency-list-of-language-problems-from-Solar-3.0.tsv
Size
14.61 MB
Format
Unknown
Description
Dataset in TSV
MD5
f19b3e606d0715bb2d8685d97ef9a973
Name
README.txt
Size
2.22 KB
Format
Text file
Description
Information on the dataset in TXT
MD5
8b226231bd3d27fa83c8579a52155f34
 File Preview  
***************

SLO: Podatkovni niz vsebuje povedi z jezikovnimi napakami in popravljene povedi, kakor tudi dodatne informacije o značilnostih izvornega besedila. Za več informacij gl. vnos korpusa Šolar 3.0 na repozitoriju in priložene označevalne smernice.
ENG: The dataset comprises sentences with language errors and the corresponding corrected sentences, together with additional information on the features of the source text. For details, see the Šolar 3.0 corpus entry in the repository and the attached annotation guidelines.

Arhar Holdt, Špela; et al., 2022, Developmental corpus Šolar 3.0, Slovenian language resource repository CLARIN.SI, ISSN 2820-4042, http://hdl.handle.net/11356/1589.

***************

"ID_besedila_s": SLO: ID izvornega besedila v korpusu Šolar 3.0. ENG: An ID of the source text in the Šolar 3.0 corpus.

"ID_odstavka_s": SLO: ID izvornega odstavka v korpusu Šolar 3.0. ENG: An ID of the source paragraph in the Šolar 3.0 corpus.

"ID_stavka_s": SLO: ID izvorne povedi v korpusu Šolar . . .
                                            
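The column descriptions above suggest how the TSV might be read programmatically. The following is a minimal Python sketch, not part of the dataset: it assumes the file is UTF-8 encoded and that its first row is a header containing the three ID columns quoted in the README preview. The helper name `iter_source_ids` is hypothetical, and the remaining columns are not shown in the preview, so they are ignored here.

```python
import csv
from typing import Dict, Iterable, Iterator

# Column names quoted in the README preview. The full header is not shown
# there, so only these three are used (assumption: they appear verbatim
# in the first row of the TSV).
ID_COLUMNS = ("ID_besedila_s", "ID_odstavka_s", "ID_stavka_s")


def iter_source_ids(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield the source text, paragraph and sentence IDs for each TSV row."""
    # QUOTE_NONE treats quote characters inside sentences as ordinary text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    missing = [c for c in ID_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found in header: {missing}")
    for row in reader:
        yield {name: row[name] for name in ID_COLUMNS}


# Possible usage with the downloaded file (path is an assumption):
# with open("Frequency-list-of-language-problems-from-Solar-3.0.tsv",
#           encoding="utf-8", newline="") as handle:
#     for ids in iter_source_ids(handle):
#         print(ids["ID_besedila_s"], ids["ID_stavka_s"])
```

Streaming the rows keeps memory use low for a file of this size; the header check fails early if the column names differ from those in the README.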
Name
Smernice-za-oznacevanje-korpusa-Solar_V1.1.pdf
Size
856.9 KB
Format
PDF
Description
Error annotation guidelines (in Slovenian)
MD5
c8b8b68fd1be51e1edadb7dd249b3ab4
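The MD5 values listed for each file can be used to check that a download is intact. Below is a minimal Python sketch of such a check, offered as an illustration rather than a tool provided by the repository. The function names `md5_of_file` and `verify_downloads` are hypothetical, and the sketch assumes the three files were saved under their listed names in a single directory.

```python
import hashlib
import os
from typing import Dict

# File names and MD5 checksums as listed on this page.
EXPECTED_MD5 = {
    "Frequency-list-of-language-problems-from-Solar-3.0.tsv":
        "f19b3e606d0715bb2d8685d97ef9a973",
    "README.txt": "8b226231bd3d27fa83c8579a52155f34",
    "Smernice-za-oznacevanje-korpusa-Solar_V1.1.pdf":
        "c8b8b68fd1be51e1edadb7dd249b3ab4",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory: str) -> Dict[str, bool]:
    """Map each listed file name to True if its checksum matches.

    A file that is missing from the directory is reported as False.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = os.path.join(directory, name)
        results[name] = os.path.isfile(path) and md5_of_file(path) == expected
    return results
```

Calling `verify_downloads(".")` in the download directory would return a dictionary of file names and match results. MD5 is suitable here only as an integrity check against transfer errors, not as protection against deliberate tampering.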